Foreword


As the Second World War was coming to its end, John von Neumann, arguably the 
foremost mathematician of that time, was busy initiating two intellectual currents that 
would shape the rest of the twentieth century: game theory and algorithms. In 1944 (16 
years after the minmax theorem) he published, with Oskar Morgenstern, his Games
and Economic Behavior, thus founding not only game theory but also utility theory and 
microeconomics. Two years later he wrote his draft report on the EDVAC, inaugurating 
the era of the digital computer and its software and its algorithms. Von Neumann wrote 
in 1952 the first paper in which a polynomial algorithm was hailed as a meaningful 
advance. And, he was the recipient, shortly before his early death four years later, of 
Gödel’s letter in which the P vs. NP question was first discussed.

Could von Neumann have anticipated that his twin creations would converge half 
a century later? He was certainly far ahead of his contemporaries in his conception 
of computation as something dynamic, ubiquitous, and enmeshed in society, almost 
organic — witness his self-reproducing automata, his fault-tolerant network design, and 
his prediction that computing technology will advance in lock-step with the economy 
(for which he had already postulated exponential growth in his 1937 Vienna Colloquium 
paper). But I doubt that von Neumann could have dreamed anything close to the Internet, 
the ubiquitous and quintessentially organic computational artifact that emerged after 
the end of the Cold War (a war, incidentally, of which von Neumann was an early 
soldier and possible casualty, and that was, fortunately, fought mostly with game 
theory and decided by technological superiority — essentially by algorithms — instead 
of the thermonuclear devices that were von Neumann’s parting gift to humanity). 

The Internet turned the tables on students of both markets and computation. It 
transformed, informed, and accelerated markets, while creating new and theretofore 
unimaginable kinds of markets — in addition to being itself, in important ways, a market. 
Algorithms became the natural environment and default platform of strategic decision 
making. On the other hand, the Internet was the first computational artifact that was not 
created by a single entity (engineer, design team, or company), but emerged from the 
strategic interaction of many. Computer scientists were for the first time faced with an 
object that they had to feel with the same bewildered awe with which economists have
always approached the market. And, quite predictably, they turned to game theory for
inspiration — in the words of Scott Shenker, a pioneer of this way of thinking who has 
contributed to this volume, “the Internet is an equilibrium, we just have to identify the 
game.” A fascinating fusion of ideas from both fields — game theory and algorithms — 
came into being and was used productively in the effort to illuminate the mysteries of 
the Internet. It has come to be called algorithmic game theory. 

The chapters of this book, a snapshot of algorithmic game theory at the approximate 
age of ten written by a galaxy of its leading researchers, succeed brilliantly, I think, in 
capturing the field’s excitement, breadth, accomplishment, and promise. The first few 
chapters recount the ways in which the new field has come to grips with perhaps the 
most fundamental cultural incongruity between algorithms and game theory: the latter 
predicts the agents’ equilibrium behavior typically with no regard to the ways in which 
such a state will be reached — a consideration that would be a computer scientist’s 
foremost concern. Hence, algorithms for computing equilibria (Nash and correlated 
equilibria in games, price equilibria for markets) have been one of algorithmic game 
theory’s earliest research goals. This body of work has become a valuable contribution
to the debate in economics about the validity of behavior predictions: Efficient
computability has emerged as a very desirable feature of such predictions, while
computational intractability casts a shadow of implausibility on a proposed equilibrium
concept. Computational models that reflect the realities of the market and the Internet 
better than the von Neumann machine are of course at a premium — there are chapters 
in this book on learning algorithms as well as on distributed algorithmic mechanism 
design. 

The algorithmic nature of mechanism design is even more immediate: This elegant 
and well-developed subarea of game theory deals with the design of games, with players 
who have unknown and private utilities, such that at the equilibrium of the designed 
game the designer’s goals are attained independently of the agents’ utilities (auctions 
are an important example here). This is obviously a computational problem, and in 
fact some of the classical results in this area had been subtly algorithmic, albeit with 
little regard to complexity considerations. Explicitly algorithmic work on mechanism 
design has, in recent years, transformed the field, especially in the case of auctions 
and cost sharing (for example, how to recover the cost of an Internet service from 
customers who value the service by amounts known only to them) and has become the 
arena of especially intense and productive cross-fertilization between game theory and 
algorithms; these problems and accomplishments are recounted in the book’s second 
part. 

The third part of the book is dedicated to a line of investigation that has come 
to be called “the price of anarchy.” Selfish rational agents reach an equilibrium. The 
question arises: exactly how inefficient is this equilibrium in comparison to an idealized 
situation in which the agents would strive to collaborate selflessly with the common 
goal of minimizing total cost? The ratio of these quantities (the cost of an equilibrium 
over the optimum cost) has been estimated successfully in various Internet-related 
setups, and it is often found that “anarchy” is not nearly as expensive as one might have 
feared. For example, in one celebrated case related to routing with linear delays and 
explained in the “routing games” chapter, the overhead of anarchy is at most 33% over 
the optimum solution — in the context of the Internet such a ratio is rather insignificant 
and quickly absorbed by its rapid growth. Viewed in the context of the historical
development of research in algorithms, this line of investigation could be called “the 
third compromise.” The realization that optimization problems are intractable led us to 
approximation algorithms; the unavailability of information about the future, or the lack 
of coordination between distributed decision makers, brought us online algorithms; the 
price of anarchy is the result of one further obstacle: now the distributed decision makers 
have different objective functions. Incidentally, it is rather surprising that economists 
had not studied this aspect of strategic behavior before the advent of the Internet. One 
explanation may be that, for economists, the ideal optimum was never an available 
option; in contrast, computer scientists are still looking back with nostalgia to the 
good old days when artifacts and processes could be optimized exactly. Finally, the 
chapters on “additional topics” that conclude the book (e.g., on peer-to-peer systems 
and information markets) amply demonstrate the young area’s impressive breadth, 
reach, diversity, and scope. 

Books — a glorious human tradition apparently spared by the advent of the Internet — 
have a way of marking and focusing a field, of accelerating its development. Seven 
years after the publication of The Theory of Games, Nash was proving his theorem on 
the existence of equilibria; only time will tell how this volume will sway the path of 
algorithmic game theory. 


Paris, February 2007 Christos H. Papadimitriou 


Preface 


This book covers an area that straddles two fields, algorithms and game theory, and 
has applications in several others, including networking and artificial intelligence. Its 
text is pitched at a beginning graduate student in computer science — we hope that this 
makes the book accessible to readers across a wide range of areas. 

We started this project with the belief that the time was ripe for a book that clearly 
develops some of the central ideas and results of algorithmic game theory — a book that 
can be used as a textbook for the variety of courses that were already being offered 
at many universities. We felt that the only way to produce a book of such breadth in 
a reasonable amount of time was to invite many experts from this area to contribute 
chapters to a comprehensive volume on the topic. 

This book is partitioned into four parts: the first three parts are devoted to core areas, 
while the fourth covers a range of topics mostly focusing on applications. Chapter 1 
serves as a preliminary chapter and it introduces basic game-theoretic definitions that 
are used throughout the book. The first chapters of Parts II and III provide introductions 
and preliminaries for the respective parts. The other chapters are largely independent 
of one another. The authors were requested to focus on a few results highlighting 
the main issues and techniques, rather than provide comprehensive surveys. Most 
of the chapters conclude with exercises suitable for classroom use and also identify 
promising directions for further research. We hope these features give the book the feel 
of a textbook and make it suitable for a wide range of courses. 

You can view the entire book online at 
www.cambridge.org/us/9780521872829
username: agtluser 
password: camb2agt 

Many people’s efforts went into producing this book within a year and a half 
of its first conception. First and foremost, we thank the authors for their dedication
and timeliness in writing their own chapters and for providing important
feedback on preliminary drafts of other chapters. Thanks to Christos Papadimitriou
for his inspiring Foreword. We gratefully acknowledge the efforts of outside reviewers:
Elliot Anshelevich, Nikhil Devanur, Matthew Jackson, Vahab Mirrokni, Hervé
Moulin, Neil Olver, Adrian Vetta, and several anonymous referees. Thanks to Cindy
Robinson for her invaluable help with correcting the galley proofs. Finally, a big 
thanks to Lauren Cowles for her stellar advice throughout the production of this 
volume. 
PART ONE


Computing in Games 


CHAPTER 1 


Basic Solution Concepts and 
Computational Issues 


Éva Tardos and Vijay V. Vazirani


Abstract 


We consider some classical games and show how they can arise in the context of the Internet. We also 
introduce some of the basic solution concepts of game theory for studying such games, and some 
computational issues that arise for these concepts. 


1.1 Games, Old and New 


The Foreword talks about the usefulness of game theory in situations arising on the 
Internet. We start the present chapter by giving some classical games and showing 
how they can arise in the context of the Internet. At first, we appeal to the reader’s 
intuitive notion of a “game”; this notion is formally defined in Section 1.2. For a more 
in-depth discussion of game theory we refer the readers to books on game theory such 
as Fudenberg and Tirole (1991), Mas-Colell, Whinston, and Green (1995), or Osborne 
and Rubinstein (1994). 


1.1.1 The Prisoner’s Dilemma 


Game theory aims to model situations in which multiple participants interact or affect 
each other’s outcomes. We start by describing what is perhaps the most well-known 
and well-studied game. 


Example 1.1 (Prisoners’ dilemma) Two prisoners are on trial for a crime and 
each one faces a choice of confessing to the crime or remaining silent. If they 
both remain silent, the authorities will not be able to prove charges against them 
and they will both serve a short prison term, say 2 years, for minor offenses. If 
only one of them confesses, his term will be reduced to 1 year and he will be used
as a witness against the other, who in turn will get a sentence of 5 years. Finally,
if they both confess, they both will get a small break for cooperating with the
authorities and will have to serve prison sentences of 4 years each (rather than 5).

Clearly, there are four total outcomes depending on the choices made by each 
of the two prisoners. We can succinctly summarize the costs incurred in these 
four outcomes via the following two-by-two matrix. 


                          P2
                   Confess    Silent
    P1   Confess    4, 4       1, 5
         Silent     5, 1       2, 2


Each of the two prisoners “P1” and “P2” has two possible strategies (choices) 
to “confess” or to remain “silent.” The two strategies of prisoner P1 correspond to 
the two rows and the two strategies of prisoner P2 correspond to the two columns 
of the matrix. The entries of the matrix are the costs incurred by the players in 
each situation (left entry for the row player and the right entry for the column 
player). Such a matrix is called a cost matrix because it contains the cost incurred 
by the players for each choice of their strategies. 

The only stable solution in this game is that both prisoners confess; in each 
of the other three cases, at least one of the players can switch from “silent” to 
“confess” and improve his own payoff. On the other hand, a much better outcome 
for both players happens when neither of them confesses. However, this is not 
a stable solution — even if it is carefully planned out — since each of the players 
would be tempted to defect and thereby serve less time. 


The situation modeled by the Prisoner’s Dilemma arises naturally in many different
contexts; below we give an example in the context of ISP routing.


Example 1.2 (ISP routing game) Consider Internet Service Providers (ISPs) 
that need to send traffic to each other. In routing traffic that originates in one ISP 
with destination in a different ISP, the routing choice made by the originating ISP 
also affects the load at the destination ISP. We will see here how this situation 
gives rise to exactly the Prisoner’s dilemma described above. 

Consider two ISPs (Internet Service Providers), as depicted in Figure 1.1, each 
having its own separate network. The two networks can exchange traffic via two 
transit points, called peering points, which we will call C and S. 

In the figure we also have two origin-destination pairs $s_i$ and $t_i$, each crossing
between the domains. Suppose that ISP 1 needs to send traffic from point $s_1$ in its
own domain to point $t_1$ in the 2nd ISP’s domain. ISP 1 has two choices for sending its
traffic, corresponding to the two peering points. ISPs typically behave selfishly
and try to minimize their own costs, and so send traffic to the closest peering point,
as the ISP with the destination node must route the traffic, no matter where it
enters its domain. Peering point C is closer: using this peering point, ISP 1 incurs
a cost of 1 unit (in sending traffic along 1 edge), whereas if it uses the farther
peering point S, it incurs a cost of 2.

Figure 1.1. The ISP routing problem.

Note that the farther peering point S is more directly on the route to the destination
$t_1$, and hence routing through S results in a shorter overall path. The length of the
path through C is 4 while the length through S is 2, as the destination is very close to S.

The situation described for ISP 1 routing traffic from $s_1$ to $t_1$ is in a way
analogous to a prisoner’s choices in the Prisoner’s Dilemma: there are two choices, 
one is better from a selfish perspective (“confess” or route through peering point 
C), but hurts the other player. To make our routing game identical to the Prisoner’s 
Dilemma, assume that symmetrically the 2nd ISP needs to send traffic from point 
$s_2$ in its domain to point $t_2$ in the 1st ISP’s domain. The two choices of the
two ISPs lead to a game with cost matrix identical to the matrix above with C 
corresponding to “confess” and S corresponding to remaining “silent.” 
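The correspondence with the Prisoner’s Dilemma can be checked with a short sketch. The per-ISP split of each path is an assumption inferred from the numbers quoted above: a path through C has length 4, of which the sender pays 1, so the receiving ISP is assumed to carry the remaining 3; a path through S has length 2, all of it paid by the sender. The names `SEND_COST`, `CARRY_COST`, and `isp_costs` are illustrative only.

```python
from itertools import product

# Assumed split of each path between the two ISPs (inferred from the text):
# via C the sender pays 1 and the receiver carries 3 (total length 4);
# via S the sender pays 2 and the receiver carries 0 (total length 2).
SEND_COST = {"C": 1, "S": 2}
CARRY_COST = {"C": 3, "S": 0}


def isp_costs(choice_1, choice_2):
    """Return (cost to ISP 1, cost to ISP 2) for the chosen peering points."""
    # Each ISP pays for sending its own traffic and for carrying the other's.
    cost_1 = SEND_COST[choice_1] + CARRY_COST[choice_2]
    cost_2 = SEND_COST[choice_2] + CARRY_COST[choice_1]
    return cost_1, cost_2


# Cost matrix indexed by (ISP 1's choice, ISP 2's choice).
cost_matrix = {(a, b): isp_costs(a, b) for a, b in product("CS", repeat=2)}
```

Under these assumptions the four entries coincide with the cost matrix of Example 1.1, with C in the role of “confess” and S in the role of “silent.”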


1.1.2 The Tragedy of the Commons 


In this book we will be most concerned with situations where many participants interact, 
and such situations are naturally modeled by games that involve many players: there 
are thousands of ISPs, and many millions of traffic streams to be routed. We will give 
two examples of such games, first a multiplayer version of the Prisoner’s Dilemma 
that we will phrase in terms of a pollution game. Then we will discuss the well-known 
game of Tragedy of the Commons. 


Example 1.3 (Pollution game) This game is the extension of Prisoner’s 
Dilemma to the case of many players. The issues modeled by this game arise 
in many contexts; here we will discuss it in the context of pollution control. Assume
that there are n countries in this game. For a simple model of this situation,
assume that each country faces the choice of either passing legislation to control
pollution or not. Assume that pollution control has a cost of 3 for the country, but
each country that pollutes adds 1 to the cost of all countries (in terms of added
health costs, etc.). The cost of controlling pollution (which is 3) is considerably
larger than the cost of 1 a country pays for being socially irresponsible.

Suppose that k countries choose not to control pollution. Clearly, the cost 
incurred by each of these countries is k. On the other hand, the cost incurred by 
the remaining n - k countries is k + 3 each, since they have to pay the added
cost for their own pollution control. The only stable solution is the one in which 
no country controls pollution, having a cost of n for each country. In contrast, 
if they all had controlled pollution, the cost would have been only 3 for each 
country. 
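The claim that polluting is the only stable choice can be verified by direct enumeration. The following is a minimal sketch using the costs 3 and 1 from the example; the function names are illustrative and not part of the original text.

```python
CONTROL_COST = 3    # cost of passing pollution-control legislation (from the example)
POLLUTION_COST = 1  # cost each polluting country adds to every country (from the example)


def country_cost(controls, polluters):
    """Cost to one country, given its own choice and the total number of polluters."""
    return POLLUTION_COST * polluters + (CONTROL_COST if controls else 0)


def polluting_is_always_better(n):
    """Check that polluting is strictly cheaper whatever the other n - 1 countries do."""
    for others_polluting in range(n):  # 0, 1, ..., n - 1 of the others pollute
        cost_if_pollute = country_cost(False, others_polluting + 1)
        cost_if_control = country_cost(True, others_polluting)
        if cost_if_pollute >= cost_if_control:
            return False
    return True
```

For example, `polluting_is_always_better(10)` holds, and with n = 10 the all-polluting outcome costs each country `country_cost(False, 10)`, that is 10, against 3 if all had controlled pollution.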


The games we have seen so far share the feature that there is a unique optimal 
“selfish” strategy for each player, independent of what other players do. No matter 
what strategy the opponent plays, each player is better off playing his or her selfish 
strategy. Next, we will see a game where the players’ optimal selfish strategies depend 
on what the other players play. 


Example 1.4 (Tragedy of the commons) We will describe this game in the 
context of sharing bandwidth. Suppose that n players each would like to have part 
of a shared resource. For example, each player wants to send information along 
a shared channel of known maximum capacity, say 1. In this game each player 
will have an infinite set of strategies: player $i$’s strategy is to send $x_i$ units of flow
along the channel for some value $x_i \in [0, 1]$.

Assume that each player would like to have a large fraction of the bandwidth, 
but assume also that the quality of the channel deteriorates with the total bandwidth 
used. We will describe this game by a simple model, using a benefit or payoff 
function for each set of strategies. If the total bandwidth $\sum_j x_j$ exceeds the channel
capacity, no player gets any benefit. If $\sum_j x_j < 1$ then the value for player $i$ is
$x_i (1 - \sum_j x_j)$. This models exactly the kind of trade-off we had in mind: the
benefit for a player deteriorates as the total assigned bandwidth increases, but it 
increases with his own share (up to a point). 


To understand what stable strategies are for a player, let us concentrate on player
$i$, and assume that $t = \sum_{j \neq i} x_j < 1$ flow is sent by all other players. Now player $i$
faces a simple optimization problem for selecting his flow amount: sending $x$ flow
results in a benefit of $x(1 - t - x)$. Using elementary calculus (setting the derivative
$1 - t - 2x$ to zero), we get that the optimal
solution for player $i$ is $x = (1 - t)/2$. A set of strategies is stable if all players are
playing their optimal selfish strategy, given the strategies of all other players. For this
case, this means that $x_i = (1 - \sum_{j \neq i} x_j)/2$ for all $i$, which has a unique solution in
$x_i = 1/(n + 1)$ for all $i$.

Why is this solution a tragedy? The total value of the solution is extremely low.
The value for player $i$ is $x_i (1 - \sum_j x_j) = 1/(n + 1)^2$, and the sum of the values
over all players is then $n/(n + 1)^2 \approx 1/n$. In contrast, if the total bandwidth used is
$\sum_i x_i = 1/2$ then the total value is 1/4, approximately $n/4$ times bigger. In this game
the $n$ users sharing the common resource overuse it so that the total value of the shared
resource decreases quite dramatically. The pollution game above has a similar effect,
where the common resource of the environment is overused by the $n$ players, increasing
the cost from 3 to $n$ for each player.
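The comparison between the stable solution and a shared use of half the channel can be reproduced with exact arithmetic. This is a minimal sketch; the function names are illustrative, and the equal split of the total flow 1/2 among the players is an assumption made only to have a concrete profile to evaluate.

```python
from fractions import Fraction


def best_response(t):
    """Flow maximizing x * (1 - t - x) when the other players send t < 1 in total."""
    return (1 - t) / 2


def value(x_i, total):
    """Benefit to a player sending x_i when `total` flow is sent overall."""
    return x_i * (1 - total) if total < 1 else Fraction(0)


def stable_total_value(n):
    """Total value when every player sends 1/(n + 1)."""
    x = Fraction(1, n + 1)
    # Each player is already best-responding to the other n - 1 players.
    assert best_response((n - 1) * x) == x
    return n * value(x, n * x)


def shared_total_value(n):
    """Total value when a total flow of 1/2 is split equally (an assumed split)."""
    x = Fraction(1, 2 * n)
    return n * value(x, n * x)
```

With n = 9, `stable_total_value(9)` is 9/100 while `shared_total_value(9)` is 1/4, which matches the formulas $n/(n+1)^2$ and 1/4 above.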


1.1.3 Coordination Games 


In our next example, there will be multiple outcomes that can be stable. This game is 
an example of a so-called “coordination game.” A simple coordination game involves 
two players choosing between two options, each wanting to choose the same option as the other.


Example 1.5 (Battle of the sexes) Consider that two players, a boy and a girl, 
are deciding on how to spend their evening. They both consider two possibilities: 
going to a baseball game or going to a softball game. The boy prefers baseball and 
the girl prefers softball, but they both would like to spend the evening together 
rather than separately. Here we express the players’ preferences again via payoffs 
(benefits) as follows. 


[Payoff matrix for the Battle of the Sexes; the entries did not survive extraction.]


Clearly, the two solutions where the two players choose different games are 
not stable — in each case, either of the two players can improve their payoff by 
switching their action. On the other hand, the two remaining options, both attending
the same game, whether it is softball or baseball, are both stable solutions; the
girl prefers the first and the boy prefers the second. 


Coordination games also arise naturally in many contexts. Here we give an example 
of a coordination game in the context of routing to avoid congestion. The good outcomes 
in the Battle of the Sexes were to attend the same game. In contrast, in the routing game, 
good outcomes will require routing on different paths to avoid congestion. Hence, this 
will be an “anticoordination” game. 


Example 1.6 (Routing congestion game) Suppose that two traffic streams originate
at proxy node O, and need to be routed to the rest of the network, as
shown in Figure 1.2. Suppose that node O is connected to the rest of the network 
via connection points A and B, where A is a little closer than B. However, both 
connection points get easily congested, so sending both streams through the same 
connection point causes extra delay. Good outcomes in this game will be for the 
two players to “coordinate” and send their traffic through different connection 
points. 
Figure 1.2. Routing to avoid congestion and the corresponding cost matrix.


We model this situation via a game with the two streams as players. Each 
player has two available strategies — routing through A or routing through 
B — leading to four total possibilities. The matrix of Figure 1.2 expresses the 
costs to the players in terms of delays depending on their routing choices. 


1.1.4 Randomized (Mixed) Strategies 


In the games we considered so far, there were outcomes that were stable in the sense 
that none of the players would want to individually deviate from such an outcome. Not all
games have such stable solutions, as illustrated by the following example. 


Example 1.7 (Matching pennies) Two players, each having a penny, are asked
to choose from among two strategies: heads (H) and tails (T). The row player
wins if the two pennies match, while the column player wins if they do not match,
as shown by the following payoff matrix, where 1 indicates a win and -1 indicates a
loss (the row player’s payoff is listed first).

                H          T
        H     1, -1      -1, 1
        T    -1, 1        1, -1


One can view this game as a variant of the routing congestion game in which the 
column player is interested in getting good service, hence would like the two players to 
choose different routes, while the row player is interested only in disrupting the column 
player’s service by trying to choose the same route. It is easy to see that this game has 
no stable solution. Instead, it seems best for the players to randomize in order to thwart
the strategy of the other player. 
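Both observations can be checked mechanically. The sketch below, with illustrative names, first enumerates the four outcomes and looks for one from which no player gains by deviating alone, and then computes the row player’s expected payoff for a fixed choice against a column player who randomizes. The randomization probability is a free parameter of the sketch, not something fixed by the text.

```python
from fractions import Fraction
from itertools import product

STRATEGIES = ("H", "T")


def payoffs(row, col):
    """(row payoff, column payoff): the row player wins exactly when the pennies match."""
    return (1, -1) if row == col else (-1, 1)


def is_stable(row, col):
    """True if neither player can gain by switching strategy alone."""
    row_pay, col_pay = payoffs(row, col)
    row_can_gain = any(payoffs(r, col)[0] > row_pay for r in STRATEGIES)
    col_can_gain = any(payoffs(row, c)[1] > col_pay for c in STRATEGIES)
    return not (row_can_gain or col_can_gain)


# Outcomes from which nobody wants to deviate; empty for matching pennies.
stable_outcomes = [p for p in product(STRATEGIES, repeat=2) if is_stable(*p)]


def expected_row_payoff(row, prob_col_heads):
    """Row player's expected payoff for a fixed choice against a randomizing column player."""
    p = Fraction(prob_col_heads)
    return p * payoffs(row, "H")[0] + (1 - p) * payoffs(row, "T")[0]
```

The list `stable_outcomes` is empty. Against a column player who chooses heads with probability 1/2, both `expected_row_payoff("H", Fraction(1, 2))` and `expected_row_payoff("T", Fraction(1, 2))` are 0, so the row player cannot exploit the randomization, whereas any other probability leaves one of the two choices with a positive expected payoff.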


1.2 Games, Strategies, Costs, and Payoffs 


We have given examples of games and discussed costs, payoffs, and strategies in an 
informal way. Next we will define such a game more formally. The games we considered 
above were all one-shot simultaneous move games, in that all players simultaneously 
chose an action from their set of possible strategies. 


1.2.1 Defining a Simultaneous Move Game 


Formally, such a game consists of a set of $n$ players, $\{1, 2, \ldots, n\}$. Each player $i$ has his
own set of possible strategies, say $S_i$. To play the game, each player $i$ selects a strategy
$s_i \in S_i$. We will use $s = (s_1, \ldots, s_n)$ to denote the vector of strategies selected by the
players and $S = \times_i S_i$ to denote the set of all possible ways in which players can pick
strategies.

The vector of strategies $s \in S$ selected by the players determines the outcome for
each player; in general, the outcome will be different for different players. To specify 
the game, we need to give, for each player, a preference ordering on these outcomes by 
giving a complete, transitive, reflexive binary relation on the set of all strategy vectors 
S; given two elements of S, the relation for player i says which of these two outcomes 
$i$ weakly prefers; we say that $i$ weakly prefers $S_1$ to $S_2$ if $i$ either prefers $S_1$ to $S_2$ or
considers them as equally good outcomes. For example, in the matching pennies game 
the row player prefers strategy vectors in which the two pennies match and the column 
player prefers those in which the pennies do not match. 

The simplest way to specify preferences is by assigning, for each player, a value to 
each outcome. In some games it will be natural to think of the values as the payoffs to 
players and in others as the costs incurred by players. We will denote these functions 
by $u_i : S \to \mathbb{R}$ and $c_i : S \to \mathbb{R}$, respectively. Clearly, costs and payoffs can be used
interchangeably, since $u_i(s) = -c_i(s)$.

If we had defined, for each player $i$, $u_i$ to be simply a function of $s_i$, the strategy
chosen by player $i$, rather than $s$, the strategies chosen by all $n$ players, then we would
have obtained n independent optimization problems. Observe the crucial difference 
between this and a game — in a game, the payoff of each player depends not only on 
his own strategy but also on the strategies chosen by all other players. 


1.2.2 Standard Form Games and Compactly Represented Games 


To develop an algorithmic theory of games, we need to discuss how a game is specified. 
One option is to explicitly list all possible strategies, and the preferences or utilities 
of all players. Expressing games in this form with a cost or utility function is called 
the standard form or matrix form of a game. It is very convenient to define games in 
this way when there are only 2 players and the players have only a few strategies. We
have used this form in the previous section for defining the Prisoner’s Dilemma and
the Battle of the Sexes. 

However, for most games we want to consider, this explicit representation is exponential
in the size of the natural description of the game (possibly bigger or even infinite).
Most games we want to consider have many players, e.g., the many traffic streams or 
the many ISPs controlling such streams. (In fact, in Part III of this book, we will even 
encounter games with infinitely many players, modeling the limiting behavior as the 
number of players gets very large.) For an example, consider the pollution game from 
Subsection 1.1.2, where we have n players, each with two possible strategies. There 
are $2^n$ possible strategy vectors, so the explicit representation of the game requires
assigning values to each of these $2^n$ strategy vectors. The size of the input needed to describe
the game is much smaller than $2^n$, and so this explicit representation is exponentially
larger than the description of the game. 

Another reason that explicit representation of the payoffs can be exponentially large 
is that players can have exponentially many strategies in the natural size of the game. 
This happens in routing games, since the strategy space of each player consists of all 
possible paths from source to destination in the network. In the version of the Tragedy
of the Commons we discussed in Section 1.1.2, players have infinite strategy sets, since
any bandwidth $x \in [0, 1]$ is a possible strategy.

Such exponential (and superexponential) descriptions can sometimes be avoided. For 
example, the payoff may depend on the number of players selecting a given strategy, 
rather than the exact subset (as was the case for the pollution game). The routing 
congestion game discussed in Chapter 18 provides another example, where the cost 
of a chosen path depends on the total traffic routed on each edge of the path. Another 
possibility for compact representation is when the payoff of a player may depend on 
the strategies chosen by only a few other players, not all participants. Games with such 
locality properties are discussed in detail in Chapter 7. 
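The gap between the two representations is easy to see for the pollution game of Subsection 1.1.2. In the sketch below (names are illustrative), `compact_cost` describes the whole game with one short rule that depends only on a count, while `explicit_table` builds the standard form by listing a cost vector for every strategy vector.

```python
from itertools import product


def compact_cost(controls, num_polluters):
    """Compact description: cost depends only on one's own choice and a count."""
    return num_polluters + (3 if controls else 0)


def explicit_table(n):
    """Standard form: one cost vector for each of the 2**n strategy vectors."""
    table = {}
    for profile in product((True, False), repeat=n):  # True means "controls pollution"
        polluters = profile.count(False)
        table[profile] = tuple(compact_cost(c, polluters) for c in profile)
    return table
```

The table has $2^n$ entries, so `len(explicit_table(10))` is already 1024, whereas the compact rule stays the same size for every n.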


1.3 Basic Solution Concepts 


In this section we will introduce basic solution concepts that can be used to study the 
kinds of games we described in the previous section. In particular, we will formalize 
the notion of stability that we informally used in discussing solutions to some of the 
games. 


1.3.1 Dominant Strategy Solution 


The Prisoner’s Dilemma and the Pollution Game share a very special property: in each 
of these games, each player has a unique best strategy, independent of the strategies 
played by the other players. We say that a game has a dominant strategy solution if it 
has this property. 

More formally, for a strategy vector $s \in S$ we use $s_i$ to denote the strategy played by
player $i$ and $s_{-i}$ to denote the $(n - 1)$-dimensional vector of the strategies played by all
other players. Recall that we used $u_i(s)$ to denote the utility incurred by player $i$. We
will also use the notation $u_i(s_i, s_{-i})$ when it is more convenient. Using this notation,
a strategy vector $s \in S$ is a dominant strategy solution, if for each player $i$, and each
alternate strategy vector $s' \in S$, we have that

$$u_i(s_i, s'_{-i}) \geq u_i(s'_i, s'_{-i}).$$

It is important to notice that a dominant strategy solution may not give an optimal
payoff to any of the players. This was the case in both the Prisoner’s Dilemma
and the Pollution Game, where it is possible to improve the payoffs of all players
simultaneously.
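For a small game in standard form, the definition of a dominant strategy solution can be tested by brute force over all alternate strategy vectors. The sketch below is one possible rendering, with illustrative names; it uses the Prisoner’s Dilemma of Example 1.1 with utilities taken as negated costs.

```python
from itertools import product


def is_dominant_strategy_solution(strategy_sets, utility, s):
    """Check u_i(s_i, s'_{-i}) >= u_i(s'_i, s'_{-i}) for every player i and every s'."""
    for i in range(len(strategy_sets)):
        for alt in product(*strategy_sets):
            kept = alt[:i] + (s[i],) + alt[i + 1:]  # s_i played against s'_{-i}
            if utility(i, kept) < utility(i, alt):
                return False
    return True


# Prisoner's Dilemma costs (years in prison) indexed by (P1's choice, P2's choice).
COSTS = {
    ("confess", "confess"): (4, 4),
    ("confess", "silent"): (1, 5),
    ("silent", "confess"): (5, 1),
    ("silent", "silent"): (2, 2),
}
SETS = [("confess", "silent")] * 2


def pd_utility(player, profile):
    """Utility is the negated cost, following u_i(s) = -c_i(s)."""
    return -COSTS[profile][player]
```

Here `is_dominant_strategy_solution(SETS, pd_utility, ("confess", "confess"))` holds, while the check fails for `("silent", "silent")`, even though the latter gives both players a better payoff.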

Having a single dominant strategy for each player is an extremely stringent requirement
for a game and very few games satisfy it. On the other hand, mechanism design,
the topic of Part II of this book, aims to design games that have dominant strategy
solutions, and where this solution leads to a desirable outcome (either socially desirable,
or desirable for the mechanism designer). We illustrate this, using the simple example
of the Vickrey auction.


1.3.2 Vickrey Auction: Designing Games with Dominant 
Strategy Solutions 


Perhaps the most common situation in which we need to design a game is an auction. 
Suppose that we are faced with designing an auction to sell a valuable painting. To 
model this situation as a game, assume that each player (bidder) $i$ has a value $v_i$ for 
the painting. His value or payoff for not winning it is 0, and his payoff for winning it 
at a price of $p$ is $v_i - p$. The strategy of each player is simply his bid. What is a good 
mechanism (or game) for selling this painting? Here we are considering single-shot 
games, so assume that each player is asked to state his bid for the painting in a sealed 
envelope, and we will decide who to award the painting to and for what price, based 
on the bids in the envelopes. 

Perhaps the most straightforward auction would be to award the painting to the 
highest bidder and charge him his bid. This game does not have a dominant strategy 
solution. A player’s best strategy (bid) depends on what he knows or assumes about the 
strategies of the other players. Deciding what value to bid seems like a hard problem, 
and may result in unpredictable behavior. See Section 1.6 for more discussion of a 
possible solution concept for this game. 

Vickrey’s mechanism, called second price auction, avoids these bidding problems. 
As before, the painting is awarded to the bidder with highest bid; however, the amount 
he is required to pay is the value of the second highest bid. This second price auction 
has the remarkable property that each player’s dominant strategy is to report his true 
value as bid, independent of the strategies of the rest of the players! Observe that even 
if his true value happens to be very high, he is in no danger of overpaying if he reports 
it: if he wins, he will pay no more than the second highest bid. 
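The dominant strategy claim can be checked by brute force on a small grid of bids. The following is a minimal sketch, not a proof: the function names, the integer bid grid, and the tie-breaking rule (ties go to the lowest-indexed bidder) are assumptions made for illustration.

```python
from itertools import product

def second_price_auction(bids):
    """Return (winner index, price paid). Assumes at least two bidders.
    Ties go to the lowest index (an illustrative assumption)."""
    winner = max(range(len(bids)), key=lambda i: (bids[i], -i))
    price = max(b for i, b in enumerate(bids) if i != winner)
    return winner, price

def utility(i, value, bids):
    """Payoff of bidder i: value minus price if he wins, 0 otherwise."""
    winner, price = second_price_auction(bids)
    return value - price if winner == i else 0

def truthful_is_dominant(value, grid, n_others=2):
    """Check on a finite grid that bidding `value` is never worse than any
    other bid, whatever the other bidders do."""
    for others in product(grid, repeat=n_others):
        truthful = utility(0, value, (value,) + others)
        for bid in grid:
            if utility(0, value, (bid,) + others) > truthful:
                return False
    return True

print(second_price_auction([3, 9, 7]))
print(truthful_is_dominant(5, range(11)))
```

The check enumerates every combination of the other players' bids, which mirrors the definition of a dominant strategy: the truthful bid must be a best choice against each of them, not just on average.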

Let us observe two more properties of the Vickrey auction. First, it leads to the 
desirable outcome of the painting being awarded to the bidder who values it most. 
Indeed, the larger goal of mechanism design is often to design mechanisms in which 
the selfish behavior of players leads to such a socially optimal outcome. For example, 
when the government auctions off goods, such as in wireless spectrum auctions, its 
goal is typically not to make as large a profit as possible, but rather to get the spectrum 
in the hands of companies that have the best technology to offer to customers. 

Another nice feature of a dominant strategy game, such as the Vickrey auction, is 
that it is extremely simple for the players to play such a game, since each player’s 
optimal strategy is independent of other players’ choices. In fact, one can implement 
all dominant strategy games by simply asking all players for their valuation functions 
and letting the game designer “play” the game for them. This is called the revelation 
principle (see Chapter 9). (In this book, we will not consider the complex issue of how 
players arrive at their own valuation function.) Unfortunately, in many contexts the 
valuation function of a player can be very complex and direct revelation may lead to 
extensive, maybe even exponential, communication (see Chapter 11). Another problem 
with direct revelation mechanisms is that they assume the presence of a central trusted 
party. Chapter 8 shows how cryptographic techniques can help a group of players 
implement such a mechanism or game without a trusted party. 


1.3.3 Pure Strategy Nash Equilibrium 


Since games rarely possess dominant strategy solutions, we need to seek a less stringent 
and more widely applicable solution concept. A desirable game-theoretic solution is 
one in which individual players act in accordance with their incentives, maximizing 
their own payoff. This idea is best captured by the notion of a Nash equilibrium, which, 
despite its shortcomings (mentioned below), has emerged as the central solution concept 
in game theory, with extremely diverse applications. The Nash equilibrium captures 
the notion of a stable solution, discussed in Section 1.1 and used in the Tragedy of the 
Commons and the Battle of the Sexes — a solution from which no single player can 
individually improve his or her welfare by deviating. 

A strategy vector $s \in S$ is said to be a Nash equilibrium if for all players $i$ and each 
alternate strategy $s'_i \in S_i$, we have that 


$$u_i(s_i, s_{-i}) \ge u_i(s'_i, s_{-i}).$$


In other words, no player $i$ can change his chosen strategy from $s_i$ to $s'_i$ and thereby 
improve his payoff, assuming that all other players stick to the strategies they have 
chosen in s. Observe that such a solution is self-enforcing in the sense that once the 
players are playing such a solution, it is in every player’s best interest to stick to his or 
her strategy. 
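For small games given explicitly, the definition can be tested directly by enumerating all strategy vectors and all unilateral deviations. The sketch below does this; the payoff numbers for the three example games are hypothetical values chosen only to have the right structure, not the ones used earlier in the chapter.

```python
from itertools import product

def pure_nash_equilibria(strategy_sets, utility):
    """strategy_sets: one list of strategies per player.
    utility(i, s): payoff of player i at strategy vector s (a tuple)."""
    equilibria = []
    for s in product(*strategy_sets):
        stable = True
        for i, options in enumerate(strategy_sets):
            for alt in options:
                deviated = s[:i] + (alt,) + s[i + 1:]
                if utility(i, deviated) > utility(i, s):
                    stable = False  # player i gains by deviating alone
        if stable:
            equilibria.append(s)
    return equilibria

def from_table(table):
    return lambda i, s: table[s][i]

# Hypothetical payoffs (higher is better).
PRISONERS = {("C", "C"): (-2, -2), ("C", "D"): (-5, -1),
             ("D", "C"): (-1, -5), ("D", "D"): (-4, -4)}
COORDINATION = {("A", "A"): (2, 1), ("A", "B"): (0, 0),
                ("B", "A"): (0, 0), ("B", "B"): (1, 2)}
PENNIES = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
           ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}

for name, table in [("prisoners", PRISONERS), ("coordination", COORDINATION),
                    ("pennies", PENNIES)]:
    sets = [sorted({s[0] for s in table}), sorted({s[1] for s in table})]
    print(name, pure_nash_equilibria(sets, from_table(table)))
```

The enumeration takes time proportional to the number of strategy vectors, so it is only practical for games written out explicitly with few players.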

Clearly, a dominant strategy solution is a Nash equilibrium. Moreover, if the solution 
is strictly dominating (i.e., switching to it always strictly improves the outcome), it is 
also the unique Nash equilibrium. However, Nash equilibria may not be unique. For 
example, coordination games have multiple equilibria. 

We already know that Nash equilibria may not be optimal for the players, since dom- 
inant strategy solutions are Nash equilibria. For games with multiple Nash equilibria, 
different equilibria can have (widely) different payoffs for the players. For example, by 
a small change to the payoff matrix, we can modify the Battle of the Sexes game so that 
it still has two stable solutions (the ones in which both players go to the same activity); 
however, both players derive a much higher utility from one of these solutions. In 
Part II of this book we will look more carefully at the quality of the best and worst 
equilibria in different games. 

The existence of multiple Nash equilibria makes this solution concept less convinc- 
ing as a prediction of what players will do: which equilibrium should we expect them 
to play? And with independent play, how will they know which equilibrium they are 
supposed to coordinate on? But at least a Nash equilibrium is stable — once proposed, 
the players do not want to individually deviate. 


1.3.4 Mixed Strategy Nash Equilibria 


The Nash equilibria we have considered so far are called pure strategy equilibria, since 
each player deterministically plays his chosen strategy. As illustrated by the Matching 
Pennies game, a game need not possess any pure strategy Nash equilibria. However, if 
in the matching pennies game, the players are allowed to randomize and each player 
picks each of his two strategies with probability 1/2, then we obtain a stable solution 
in a sense. The reason is that the expected payoff of each player now is 0 and neither 
player can improve on this by choosing a different randomization. 

When players select strategies at random, we need to understand how they evaluate 
the random outcome. Would a player prefer a choice that leads to a small positive utility 
with high probability, but with a small probability leads to a large negative utility? Or, 
is it better to have a small loss with high probability, and a large gain with small 
probability? For the notion of mixed Nash equilibrium, we will assume that players are 
risk-neutral; that is, they act to maximize the expected payoff. 
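The stability of the uniform randomization in Matching Pennies can be verified with exact arithmetic. This is a small sketch under the risk-neutrality assumption just stated; the payoff convention (the row player wins 1 on a match and loses 1 otherwise) and the function names are assumptions for illustration. Because expected payoff is linear in a player's own mixture, it is enough to test deviations to pure strategies.

```python
from fractions import Fraction

# Row player's payoff; the column player's payoff is the negative.
ROW_PAYOFF = [[1, -1],
              [-1, 1]]

def expected_row_payoff(p, q):
    """Expected payoff of the row player when row mixes with p, column with q."""
    return sum(p[i] * q[j] * ROW_PAYOFF[i][j]
               for i in range(2) for j in range(2))

def is_mixed_equilibrium(p, q):
    """True if neither player gains by switching to a pure strategy."""
    value = expected_row_payoff(p, q)
    pure = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]
    row_ok = all(expected_row_payoff(e, q) <= value for e in pure)
    col_ok = all(-expected_row_payoff(p, e) <= -value for e in pure)
    return row_ok and col_ok

half = Fraction(1, 2)
uniform = (half, half)
biased = (Fraction(3, 4), Fraction(1, 4))

print(expected_row_payoff(uniform, uniform))
print(is_mixed_equilibrium(uniform, uniform))
print(is_mixed_equilibrium(biased, uniform))
```

In the last call the row player's bias is exploitable: the column player can switch to always playing the less likely side and win on average, so the pair is not an equilibrium.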

To define such randomized strategies formally, let us enhance the choices of players 
so each one can pick a probability distribution over his set of possible strategies; such a 
choice is called a mixed strategy. We assume that players independently select strategies 
using the probability distribution. The independent random choices of players lead 
to a probability distribution of strategy vectors $s$. Nash (1951) proved that under this 
extension, every game with a finite number of players, each having a finite set of 
strategies, has a Nash equilibrium. 


Theorem 1.8 Any game with a finite set of players and finite set of strategies 
has a Nash equilibrium of mixed strategies. 


This theorem will be further discussed and proved for the two-player case in Chapter 2. 
An important special case of two-player games is zero-sum games, games in which the 
gain of one player is exactly the loss of the other player. Nash equilibria for these 
games will be further discussed in Section 1.4. 


1.3.5 Games with No Nash Equilibria 


Both assumptions in the theorem about the finite set of players and finite strategy sets 
are important: games with an infinite number of players, or games with a finite number 
of players who have access to an infinite strategy set may not have Nash equilibria. A 
simple example of this arises in the following pricing game. 


Figure 1.3. Sellers 1 and 2 are selling identical products to buyers A, B, and C. 
(Diagram: buyer A is connected to seller 1 only, buyer C to seller 2 only, and buyer B to both sellers.) 


Example 1.9 (Pricing game) Suppose two players sell a product to three pos- 
sible buyers, as shown in Figure 1.3. Each buyer wants to buy one unit of the 
product. 

Buyers A and C have access to one seller only, namely 1 and 2, respectively. 
However, buyer B can buy the product from any of the two sellers. All three 
buyers have a budget of 1, or have maximum value 1 for the item, i.e., will not 
buy the product if the price is above 1. The sellers play a pricing game — they 
each name a price $p_i$ in the interval [0, 1]. Buyers A and C buy from sellers 1 
and 2, respectively. On the other hand, B buys from the cheaper seller. To fully 
specify the game, we have to set a rule for breaking ties. Let us say that if both 
sellers have the same price, B buys from seller 1. For simplicity, we assume no 
production costs, so the income of a seller is the sum of the prices at which they 
sold goods. 

Now, one strategy for each seller is to set a price of $p_i = 1$, and guarantee an 
income of 1 from the buyer who does not have a choice. Alternatively, they can 
also try to compete for buyer B. However, by the rules of this game they are not 
allowed to price-discriminate; i.e., they cannot sell the product to the two buyers 
at different prices. In this game, each player has uncountably many available 
strategies, i.e., all numbers in the interval [0, 1]. It turns out that this game does 
not have a Nash equilibrium, even if players are allowed to use mixed strategies. 

To see that no pure strategy equilibrium exists, note that if $p_1 > 1/2$, player 2 
will slightly undercut the price, set it at $1/2 < p_2 < p_1$, and have income of more 
than 1, and then in turn player 1 will undercut player 2, etc. So we cannot have 
$p_1 > 1/2$ in an equilibrium. If $p_1 \le 1/2$, the unique best response for player 2 
is to set $p_2 = 1$. But then player 1 will increase his price, so $p_1 \le 1/2$ also does 
not lead to an equilibrium. It is a bit harder to argue that there is also no mixed 
strategy equilibrium in this game. 
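The undercutting argument can be illustrated on a discretized version of the pricing game. The sketch below restricts prices to multiples of 1/steps, stored as integers to avoid rounding; the grid and the function names are assumptions. A grid search is only an illustration: the discretized game is finite, so by Theorem 1.8 it does have a mixed equilibrium, and very coarse grids can even have pure ones because undercutting by a whole grid step may not pay.

```python
def incomes(p1, p2):
    """Incomes of sellers 1 and 2 in grid units.
    Buyer B buys from the cheaper seller, and from seller 1 on a tie."""
    b_buys_from_1 = p1 <= p2
    income1 = p1 * (2 if b_buys_from_1 else 1)
    income2 = p2 * (1 if b_buys_from_1 else 2)
    return income1, income2

def pure_equilibria(steps):
    """All pure equilibria when prices are k/steps for k = 0..steps."""
    prices = range(steps + 1)
    # Best achievable income of each seller against each opposing price.
    best1 = {p2: max(incomes(d, p2)[0] for d in prices) for p2 in prices}
    best2 = {p1: max(incomes(p1, d)[1] for d in prices) for p1 in prices}
    return [(p1, p2) for p1 in prices for p2 in prices
            if incomes(p1, p2) == (best1[p2], best2[p1])]

print(pure_equilibria(100))
```

With a grid of 100 steps the search finds no pair of prices at which both sellers are best responding, matching the case analysis in the example.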


1.3.6 Correlated Equilibrium 


A further relaxation of the Nash equilibrium notion was introduced by Aumann (1974), 
called correlated equilibrium. The following simple example nicely illustrates this 
notion. 


Example 1.10 (Traffic light) The game we consider is when two players drive 
up to the same intersection at the same time. If both attempt to cross, the result 
is a fatal traffic accident. The game can be modeled by a payoff matrix where 
crossing successfully has a payoff of 1, not crossing pays 0, while an accident 
has a payoff of -100. Each entry below lists (payoff of player 1, payoff of player 2), 
with player 1 choosing the row and player 2 the column. 


                  Cross            Stop 
    Cross     (-100, -100)        (1, 0) 
    Stop         (0, 1)           (0, 0) 

This game has three Nash equilibria: two correspond to letting only one 
car cross, the third is a mixed equilibrium where both players cross with an 
extremely small probability $\epsilon = 1/101$, and with probability $\epsilon^2$ they crash. 
The first two equilibria have a payoff of 1. The last one is more fair, but 
has low expected payoff (each player's expected payoff is 0, since a mixing 
player must be indifferent between crossing and stopping), and also has a 
positive chance ($\epsilon^2 \approx 0.0001$) of a car crash. 

In a Nash equilibrium, players choose their strategies independently. In con- 
trast, in a correlated equilibrium a coordinator can choose strategies for both 
players; however, the chosen strategies have to be stable: we require that 
each player finds it in his or her interest to follow the recommended strategy. 
For example, in a correlated equilibrium the coordinator can randomly let 
one of the two players cross with any probability. The player who is told to 
stop has 0 payoff, but he knows that attempting to cross will cause a traffic 
accident. 


Correlated equilibria will be discussed in detail in Section 2.7. Formally, this notion 
assumes an external correlation device, such as a trusted game coordinator, or some 
other physical source. A correlated equilibrium is a probability distribution over strategy 
vectors $s \in \times_i S_i$. Let $p(s)$ denote the probability of strategy vector $s$, where we will 
also use the notation $p(s) = p(s_i, s_{-i})$ when talking about a player $i$. The distribution 
is a correlated equilibrium if for all players $i$ and all strategies $s_i, s'_i \in S_i$, we have the 
inequality 


$$\sum_{s_{-i}} p(s_i, s_{-i})\, u_i(s_i, s_{-i}) \ge \sum_{s_{-i}} p(s_i, s_{-i})\, u_i(s'_i, s_{-i}).$$


In words, if player $i$ receives a suggested strategy $s_i$, the expected profit of the player 
cannot be increased by switching to a different strategy $s'_i \in S_i$. Nash equilibria are 
special cases of correlated equilibria, where the distribution over S is the product of 
independent distributions for each player. However, correlation allows a richer set of 
equilibria as we will see in Section 2.7. 
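The inequality can be checked mechanically for the traffic light game of Example 1.10. The sketch below is restricted to two players for brevity; the function name and the dictionary encoding of the distribution are assumptions. It tests a fair coin flip between the two "one car crosses" outcomes, the mixed Nash equilibrium with crossing probability 1/101, and a distribution that always tells both players to cross.

```python
from fractions import Fraction

STRATEGIES = ("Cross", "Stop")
# PAYOFF[(s1, s2)] = (utility of player 1, utility of player 2), Example 1.10.
PAYOFF = {("Cross", "Cross"): (-100, -100), ("Cross", "Stop"): (1, 0),
          ("Stop", "Cross"): (0, 1), ("Stop", "Stop"): (0, 0)}

def is_correlated_equilibrium(dist):
    """dist maps strategy vectors to probabilities (missing entries are 0)."""
    for i in (0, 1):
        for rec in STRATEGIES:          # recommended strategy s_i
            for alt in STRATEGIES:      # candidate deviation s'_i
                follow = deviate = 0
                for other in STRATEGIES:
                    s = (rec, other) if i == 0 else (other, rec)
                    d = (alt, other) if i == 0 else (other, alt)
                    prob = dist.get(s, 0)
                    follow += prob * PAYOFF[s][i]
                    deviate += prob * PAYOFF[d][i]
                if follow < deviate:
                    return False
    return True

coin_flip = {("Cross", "Stop"): Fraction(1, 2), ("Stop", "Cross"): Fraction(1, 2)}
eps = Fraction(1, 101)
mix = {"Cross": eps, "Stop": 1 - eps}
mixed_nash = {(a, b): mix[a] * mix[b] for a in STRATEGIES for b in STRATEGIES}
both_cross = {("Cross", "Cross"): Fraction(1)}

print(is_correlated_equilibrium(coin_flip))
print(is_correlated_equilibrium(mixed_nash))
print(is_correlated_equilibrium(both_cross))
```

The sums are left unnormalized, exactly as in the displayed inequality, so recommendations that occur with probability 0 impose no constraint.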


1.4 Finding Equilibria and Learning in Games 


In this section we consider two closely related issues: how easy is it to find an equi- 
librium, and does “natural game play” lead the players to an equilibrium? Ideally, a 
perfect solution concept is one which is computationally easy to find, and also easy to 
find by players playing independently. 


1.4.1 Complexity of Finding Equilibria 


The complexity of finding Nash and correlated equilibria will be discussed in detail in 
Chapters 2 and 3. Here we give a short overview. We then discuss two-player zero-sum 
games in more detail and show that for such games a Nash equilibrium can be found 
efficiently using linear programming. It turns out that even general two-player games 
have a character different from that of games with three or more players. For example, 
two-player games where payoffs are rational numbers always admit a solution with 
rational probabilities, and this is not true for games with three or more players. Games 
with two players will be discussed in greater detail in Chapter 3. 

We will discuss the complexity of finding Nash equilibrium in Chapter 2. NP- 
completeness, the “standard” way of establishing intractability of individual problems, 
does not seem to be the right tool for studying the complexity of Nash equilibria. 
Instead, we will use PPAD-completeness (see Chapter 2 for the definition). The problem 
of finding a Nash equilibrium is PPAD-complete even for two-player games in standard 
form. 

In contrast, we will see in Section 2.7 that correlated equilibria are computationally 
easier. Correlated equilibria form a convex set and hence can be found in polynomial 
time for games defined explicitly via their payoff matrices, and finding a correlated 
equilibrium is polynomially solvable even in many compactly represented games. 
However, finding an “optimal” correlated equilibrium is computationally hard in many 
natural classes of compactly represented games. 


1.4.2 Two-Person Zero-Sum Games 


Here we consider two-player zero-sum games in more detail. A two-player game is a 
zero-sum game if the sum of the payoffs of the two players is zero for any choice of 
strategies. For such games it is enough to give the payoffs of the row player. Let A be 
the matrix of these payoffs, representing the winnings of the row player and the loss of 
the column player. 

Recall from Theorem 1.8 that a Nash equilibrium of mixed strategies always exists. 
We will use this fact to show that an equilibrium can be found using linear programming. 
Consider a pair of probability distributions $p^*$ and $q^*$ for the row and column players 
that form a Nash equilibrium. The expected value paid by the column player to the row 
player can be expressed as $v^* = p^* A q^*$ (if we think of $p^*$ as a row vector and $q^*$ as a 
column vector). 

A Nash equilibrium has the property that even if the players know the strategies 
played by the other players (the probability distribution they are using), they cannot 
be better off by deviating. With this in mind, consider a strategy p for the row player. 
The expected payoffs for different strategies of the column player will be $pA$. Once 
$p$ is known, the column player will want to minimize his loss, and play strategies that 
correspond to the minimum entries in $pA$. So the best publicly announced strategy for 
the row player is to maximize this minimum value. This best public strategy can be 
found by solving the following linear program: 


$$
\begin{aligned}
v_r = \max\ & v \\
\text{subject to}\quad & p \ge 0, \\
& \sum_i p_i = 1, \\
& (pA)_j \ge v \quad \text{for all } j,
\end{aligned}
$$


where we use $(pA)_j$ to denote the $j$th entry of the vector $pA$. The optimum value $v_r$ 
is the row player’s maximum safe value, the maximum value he or she can guarantee 
to win by playing a mixed strategy $p$ that will be known to the column player. 
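When the row player has only two pure strategies, this linear program has a single free variable and can be solved exactly without an LP solver. The sketch below is a special-case illustration, not a general method: it writes $p = (x, 1-x)$, so each column gives a line in $x$, and the guaranteed value (the minimum over columns) is maximized at an endpoint or where two lines cross. The function name and the second example matrix are assumptions.

```python
from fractions import Fraction
from itertools import combinations

def row_safe_value(A):
    """A is a 2 x n payoff matrix for the row player.
    Returns (v_r, (x, 1 - x)) maximizing the minimum entry of pA."""
    assert len(A) == 2
    top, bottom = A
    # Column j yields bottom[j] + x * (top[j] - bottom[j]) when p = (x, 1 - x).
    lines = [(Fraction(b), Fraction(t) - Fraction(b)) for t, b in zip(top, bottom)]
    candidates = {Fraction(0), Fraction(1)}
    for (c1, m1), (c2, m2) in combinations(lines, 2):
        if m1 != m2:
            x = (c2 - c1) / (m1 - m2)   # where the two lines intersect
            if 0 <= x <= 1:
                candidates.add(x)

    def guaranteed(x):
        return min(c + m * x for c, m in lines)

    best_x = max(candidates, key=guaranteed)
    return guaranteed(best_x), (best_x, 1 - best_x)

print(row_safe_value([[1, -1], [-1, 1]]))   # Matching Pennies
print(row_safe_value([[2, -1], [-1, 1]]))   # a hypothetical asymmetric game
```

For larger games the same program would normally be handed to a general linear programming solver; the one-variable case is used here only because it can be verified by hand.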

How do $v_r$ and the Nash value $v^*$ compare? Clearly $v_r \le v^*$, since the row player 
can guarantee to win $v_r$, so must win at least this much in any equilibrium. On the other 
hand, an equilibrium is a strategy that is stable even if known to the opponent, so it 
must be the case that the column player is in fact selecting the columns with minimum 
value in $p^* A$, so we must have $v^* \le v_r$, and hence $v_r = v^*$. 

Similarly, we can set up the analogous linear program to get the value $v_c$, the column 
player’s minimum safe value, the minimum loss the column player can guarantee by 
playing a mixed strategy $q$ that will be known to the row player: 


$$
\begin{aligned}
v_c = \min\ & v \\
\text{subject to}\quad & q \ge 0, \\
& \sum_j q_j = 1, \\
& (Aq)_i \le v \quad \text{for all } i,
\end{aligned}
$$


where we use $(Aq)_i$ to denote the $i$th entry of the vector $Aq$. We can argue that $v^* = v_c$ 
also holds. Hence we get that $v_c = v_r$, the row player’s maximum guaranteed win is 
the same as the column player’s minimum guaranteed loss. This will imply that the 
optimal solutions to this pair of linear programs form a Nash equilibrium. 


Theorem 1.11. Optimum solutions for the above pair of linear programs give 
probability distributions that form a Nash equilibrium of the two-person zero-sum 
game. 


PROOF Let $p$ and $q$ denote optimum solutions to the two linear programs. We 
argued above that $v_r = v_c$. If the players play this pair of strategies, then the row 
player cannot increase his win, as the column player is guaranteed by his strategy 
not to lose more than $v_c$. Similarly, the column player cannot decrease his loss, as 
the row player is guaranteed to win $v_r$ by his strategy. So the pair of strategies is 
at equilibrium. 
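The argument in the proof amounts to a two-sided check that is easy to carry out for a given pair of strategies: no row may pay more than the value against $q$, and no column may pay less than the value against $p$. The following sketch performs that check with exact fractions; the function name and the example matrix are assumptions, and the strategies $p = q = (2/5, 3/5)$ for that matrix were worked out by hand.

```python
from fractions import Fraction

def zero_sum_equilibrium_check(A, p, q):
    """Return (is_equilibrium, value) for row mix p and column mix q,
    where A holds the row player's payoffs."""
    rows, cols = len(A), len(A[0])
    pA = [sum(p[i] * A[i][j] for i in range(rows)) for j in range(cols)]
    Aq = [sum(A[i][j] * q[j] for j in range(cols)) for i in range(rows)]
    value = sum(p[i] * Aq[i] for i in range(rows))
    # Row player cannot gain: max_i (Aq)_i <= value.
    # Column player cannot lose less: min_j (pA)_j >= value.
    return (max(Aq) <= value <= min(pA)), value

A = [[2, -1], [-1, 1]]
opt = (Fraction(2, 5), Fraction(3, 5))
uniform = (Fraction(1, 2), Fraction(1, 2))

print(zero_sum_equilibrium_check(A, opt, opt))
print(zero_sum_equilibrium_check(A, uniform, opt))
```

In the second call the row player uses a strategy that is not optimal for the first linear program, so the column player has a column paying less than the value and the pair fails the check.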


Readers more familiar with linear programming will notice that the two linear 
programs above are duals of each other. We established that $v_r = v_c$ using the existence 
of a Nash equilibrium from Theorem 1.8. Linear programming duality also implies 
that the two values $v_r$ and $v_c$ are equal. Once we know the values are equal, the proof 
of Theorem 1.11 shows that the optimal solutions form a Nash equilibrium, so linear 
programming duality yields a proof that a Nash equilibrium exists in the special case 
of zero-sum two-person games. 


1.4.3 Best Response and Learning in Games 


It would be desirable for a solution concept to satisfy a stronger condition than simply 
being polynomially computable: it should be the case that natural game playing strategies 
quickly lead players to either find the equilibrium or at least converge to an equilibrium 
in the limit. 

Maybe the most natural “game playing” strategy is the following “best response.” 
Consider a strategy vector $s$, and a player $i$. Using the strategy vector $s$ player $i$ gets 
the value or utility $u_i(s)$. Changing the strategy $s_i$ to some other strategy $s'_i \in S_i$ the 
player can change his utility to $u_i(s'_i, s_{-i})$, assuming that all other players stick to their 
strategies in $s_{-i}$. We say that a change from strategy $s_i$ to $s'_i$ is an improving response 
for player $i$ if $u_i(s'_i, s_{-i}) > u_i(s)$ and a best response if $s'_i$ maximizes the player’s utility 
$\max_{s'_i \in S_i} u_i(s'_i, s_{-i})$. Playing a game by repeatedly allowing some player to make an 
improving or a best response move is perhaps the most natural game play. 

In some games, such as the Prisoner’s Dilemma or the Coordination Game, this 
dynamic leads the players to a Nash equilibrium in a few steps. In the Tragedy of 
the Commons the players will not reach the equilibrium in a finite number of steps, 
but the strategy vector will converge to the equilibrium. In other games, the play may 
cycle, and not converge. A simple example is Matching Pennies, where the players will 
cycle through the 4 possible strategy vectors if they alternate making best response 
moves. While this game play does not find a pure equilibrium (as none exists) in some 
sense we can still say that best response converges to the equilibrium: the average 
payoff for the two players converges to 0, which is the payoff at equilibrium; and even 
the frequencies at which the 4 possible strategy vectors are played converge to the 
probabilities in equilibrium (1/4 each). 
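The cycling behavior in Matching Pennies is easy to simulate. The sketch below alternates best response moves and records how often each strategy vector is visited; the payoff convention (the row player wins 1 on a match), the starting point, and the choice of which player moves first are assumptions made for illustration.

```python
from collections import Counter

def payoff(player, row, col):
    """Row player (0) wins 1 on a match; column player (1) wins 1 otherwise."""
    row_wins = 1 if row == col else -1
    return row_wins if player == 0 else -row_wins

def best_response(player, state):
    """State after `player` switches to a best response to the other's move."""
    row, col = state
    options = ("H", "T")
    if player == 0:
        return max(options, key=lambda r: payoff(0, r, col)), col
    return row, max(options, key=lambda c: payoff(1, row, c))

def run(steps, start=("H", "H"), first_mover=1):
    state, mover = start, first_mover
    visits, total = Counter(), [0, 0]
    for _ in range(steps):
        state = best_response(mover, state)
        visits[state] += 1
        total[0] += payoff(0, *state)
        total[1] += payoff(1, *state)
        mover = 1 - mover               # players alternate moves
    return visits, [t / steps for t in total]

visits, averages = run(4000)
print(dict(visits))
print(averages)
```

Every strategy vector is visited equally often and both average payoffs are 0, in line with the mixed equilibrium, even though the play itself never settles down.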

Results about the outcome of such game playing strategies will be discussed in 
Chapter 4. We will see that best response behavior is not strong enough to guarantee 
convergence in most games. Instead, we will consider improving response type “learn- 
ing” strategies that react to the frequencies played so far, rather than just to the current 
game play. We will show that in the special case of 2-player zero-sum games such 
natural game playing does converge to a Nash equilibrium. In general, even learning 
strategies do not converge to Nash equilibria, instead they converge to the larger region 
of correlated equilibria. 


1.5 Refinement of Nash: Games with Turns and Subgame 
Perfect Equilibrium 


Nash equilibrium has become the central solution concept in game theory, despite its 
shortcomings, such as the existence of multiple equilibria. Since the emergence of this 
concept in the 1950s, there have been many refinements considered that address the 
selection of the “right” equilibrium concept. Here we will consider one such refinement 
for games with turns. 

Many games have multiple turns of moves. Card games or board games all have 
turns, but games modeling many economic situations also have this form: a service 
provider sets up a basic service (turn 1) and then users decide to use the service or 
decide not to (turn 2). 

How does Nash equilibrium extend to games with turns? We can reduce such games 
to simultaneous move games by having each player select a “full strategy” up front, 
rather than having them select moves one at a time. By a “full strategy” we mean a 
strategy for each turn, as a function of the state of the game. One issue with such 
strategies is that they tend to become rather large: a full strategy for chess would state 
the next move for any possible sequence of previous moves. This is a huge set in the 
natural description of the game in terms of the rules of chess. Games with turns are 
another example of compactly represented games. We will see more on how to work 
with this type of compactly represented game in Chapter 3. 

Here our focus is to point out that in this context the notion of Nash equilibrium 
seems a bit weak. To see why, consider the following simple game. 


Example 1.12 (Ultimatum game) Assume that a seller S is trying to sell a good 
to buyer B. Assume that the interaction has two steps: first seller S offers a price 
$p$, and then buyer B reacts to the price. We assume the seller has no value for the 
good, his payoff is $p$ if the sale occurs, and 0 otherwise. The buyer has a value 
$v$ for the good, so his payoff is $v - p$ if he buys, and 0 if he does not. Here we 
are considering a full information game in which seller S is aware of the buyer’s 
value v, and hence we expect that the seller offers price p just under v, and the 
buyer buys. (Ignore for now the issue of what happens if the price is exactly v.) 
This game allows the first player to lead, and collect (almost) all the profit. 
This game is known as the ultimatum game when two players S and B need to 
divide up v amount of money. The game allows the first player S to make an 
“ultimatum” (in the form of a price in our context) on how to divide up the money. 


To think about this game as a one-shot simultaneous move game, we need to think 
of the buyer’s strategy as a function of the offered price. A natural strategy is to “buy if 
the price is under v.” This is indeed an equilibrium of the game, but the game has many 
other equilibria. The buyer can also have the strategy that he will buy only if the price 
$p$ is at most some smaller value $m < v$. This seems bad at first (why leave the $v - p$ 
profit on the table if the price is in the range $m < p < v$), but assuming that the buyer 
uses this alternate strategy, the seller’s best move is to offer price p = m, as otherwise 
he makes no profit. This pair of strategies is also a Nash equilibrium for any value m. 
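The multiplicity of equilibria can be confirmed on a discretized ultimatum game. In this sketch prices are integers from 0 to $v$ and the buyer's strategy is summarized by a threshold $m$ ("buy if and only if the price is at most $m$"); both simplifications are assumptions. Restricting the buyer's deviations to thresholds loses nothing for the Nash check, because against a fixed price any buyer strategy either buys or does not.

```python
def payoffs(price, threshold, v):
    """Seller posts `price`; buyer buys iff price <= threshold.
    Returns (seller payoff, buyer payoff)."""
    if price <= threshold:
        return price, v - price
    return 0, 0

def is_nash(price, threshold, v):
    """True if neither player gains by deviating alone."""
    grid = range(v + 1)
    seller, buyer = payoffs(price, threshold, v)
    seller_ok = all(payoffs(p, threshold, v)[0] <= seller for p in grid)
    buyer_ok = all(payoffs(price, m, v)[1] <= buyer for m in grid)
    return seller_ok and buyer_ok

v = 10
print([m for m in range(v + 1) if is_nash(m, m, v)])
print(is_nash(5, 7, v))
```

Every pair of the form (price $m$, threshold $m$) passes the check, while the pair (5, 7) fails because the seller could raise the price to 7 and still sell.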

The notion of subgame perfect equilibrium formalizes the idea that the alternate 
buyer strategy of buying only at $p \le m$ is unnatural. By thinking of the game as a 
simultaneous move game, the difference between the two players in terms of the order 
of moves, is diminished. The notion of subgame perfect Nash equilibrium has been 
introduced to strengthen the concept of Nash, and make the order of turns part of the 
definition. The idea is to require that the strategy played is Nash, even after any prefix 
of the game is already played. We will see more about subgame perfect equilibrium as 
well as games with turns in Chapters 3 and 19. 
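For the ultimatum game, the subgame perfect requirement says that the buyer's reaction must be optimal after every possible price, not only after the price the seller actually offers. The sketch below applies this to the same hypothetical discretization as before (integer prices from 0 to $v$, buyer strategies given by a threshold); these modeling choices and the function names are assumptions.

```python
def buyer_reply_is_optimal(threshold, v, price):
    """Is the buyer's reaction to `price` a best reply in that subgame?"""
    buys = price <= threshold
    payoff = v - price if buys else 0
    return payoff >= max(v - price, 0)

def is_subgame_perfect(price, threshold, v):
    grid = range(v + 1)
    # The buyer must react optimally to every price, offered or not.
    buyer_ok = all(buyer_reply_is_optimal(threshold, v, p) for p in grid)

    def seller_payoff(p):
        return p if p <= threshold else 0

    seller_ok = all(seller_payoff(p) <= seller_payoff(price) for p in grid)
    return buyer_ok and seller_ok

v = 10
print([m for m in range(v + 1) if is_subgame_perfect(m, m, v)])
```

Only thresholds at or just below $v$ survive; the two survivors differ solely in what the buyer does when the price is exactly $v$, the tie the example sets aside.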


1.6 Nash Equilibrium without Full Information: 
Bayesian Games 


So far we talked about equilibrium concepts in full information games, where all play- 
ers know the utilities and strategies of all other players. When players have limited 
information, we need to consider strategies that are only based on the available informa- 
tion, and find the best strategy for the player, given all his or her available information. 
Such games will be discussed in more detail in Section 9.6. 

One source of limited information can come from not knowing properties and 
preferences of other players, and hence not knowing what strategies they will select. 
It is easiest to understand this issue by considering a game of cards, such as bridge. In 
such a game the players have information about the probability distribution of the other 
players’ cards, but do not know exactly what cards they have. A similar information 
model can also be used to model many other situations. We illustrate this by the 
Bayesian first price auction game. 


Example 1.13 (Bayesian First Price Auction) Recall the first price auction: 
all players state a bid, and the winner is the player with maximum bid, and has 
to pay his bid value as the price. What are optimal strategies for players in this 
auction? If the valuations of all players are common knowledge, then the player 
with maximum valuation would state the second valuation as his bid, and win the 
auction at the same (or slightly bigger) price as in the second price auction. But 
how should players bid if they do not know all other players’ valuations? Naturally, 
their bids will now depend on their beliefs about the values and knowledge of all 
other players. 

Here we consider the simple setup where players get their valuations from in- 
dependent probability distributions, and these distributions are public knowledge. 
How should player $i$ bid knowing his own valuation $v_i$, and the distribution of 
the valuation of the other players? Such games are referred to as Bayesian games, 
and are discussed in Section 9.6. For example, it is shown there that the unique 
Nash equilibrium in the case when player valuations come from independent and 
identical distributions is a nice analog of the second price auction: player $i$, whose 
own valuation is $v_i$, should bid the expected second valuation conditioned on $v_i$ 
being the maximum valuation. 
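As a concrete instance, suppose (an assumption, not stated in the text) that the $n$ valuations are independent and uniform on [0, 1]. Conditioned on $v_i$ being the maximum, the expected second highest valuation is $\frac{n-1}{n} v_i$, so the rule above says to bid that amount. The sketch below checks numerically that this bid is a best response when all other players follow the same rule; the grid search and function names are illustrative.

```python
def expected_utility(bid, value, n):
    """Expected payoff of bidding `bid` with valuation `value` in a first
    price auction, assuming the other n - 1 players have i.i.d. uniform[0, 1]
    valuations and each bids (n - 1) / n times his valuation."""
    scale = (n - 1) / n
    # An opponent's bid is below `bid` iff his valuation is below bid / scale.
    win_probability = min(1.0, bid / scale) ** (n - 1)
    return (value - bid) * win_probability

def best_bid_on_grid(value, n, grid=1000):
    """Bid in {0, 1/grid, ..., 1} with the highest expected payoff."""
    best_k = max(range(grid + 1),
                 key=lambda k: expected_utility(k / grid, value, n))
    return best_k / grid

print(best_bid_on_grid(0.6, 3))   # compare with (3 - 1) / 3 * 0.6
print(best_bid_on_grid(0.5, 2))   # compare with (2 - 1) / 2 * 0.5
```

Bidders shade their bids below their valuations, and the shading shrinks as the number of competitors grows.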


1.7 Cooperative Games 


The games we talked about so far are all non-cooperative games — we assumed that 
individual players act selfishly, deviate alone from a proposed solution, if it is in their 
interest, and do not themselves coordinate their moves in groups. Cooperative game 
theory is concerned with situations when groups of players coordinate their actions. 
First, in Section 1.7.1 we define the concept of strong Nash equilibrium, a notion 
extending the Nash equilibrium concept to cooperative situations. 

Then we consider games with transferable utility, i.e., games where a player with 
increased utility has the ability to compensate some other player with decreased utility. 
When considering games with transferable utility the main concern is to develop 
solution concepts for formalizing fair ways of sharing a value or dividing up a cost in a 
cooperative environment. There have been many different notions of fairness proposed. 
In Section 1.7.2 we will briefly review two of them. We refer the reader to Chapter 15 
for a more in-depth discussion of these two and other concepts. 


1.7.1 Strong Nash Equilibrium 


The closest notion from cooperative game theory to our discussion thus far is the 
concept of strong Nash equilibrium introduced by Aumann (1959). Consider a game 
and a proposed solution, a strategy for each player. In a cooperative game we assume 
that some group A of players can change their strategies jointly, assuming that they all 
benefit. Here we are assuming that the game has nontransferable utility, which means 
that in order for a coalition to be happy, we need to make sure that the utility of each 
member is increasing (or at least is not decreasing). 

We say that a vector of strategies forms a strong Nash equilibrium if no subset A 
of players has a way to simultaneously change their strategies, improving each of the 
participant’s welfare. More formally, for a strategy vector $s$ and a set of players $A$ let 
$s_A$ denote the vector of strategies of the players in $A$ and let $s_{-A}$ denote the vector of 
strategies of the players not in $A$. We will also use $u_i(s_A, s_{-A})$ for the utility for player 
$i$ in the strategy $s$. We say that in a strategy vector $s$ a subset $A$ of players has a joint 
deviation if there are alternate strategies $s'_i \in S_i$ for $i \in A$ forming a vector $s'_A$, such 
that $u_i(s) \le u_i(s'_A, s_{-A})$ for all $i \in A$, and for at least one player in $A$ the inequality is 
strict. A strategy vector $s$ is strong Nash if no subset $A$ has a joint deviation. 
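The definition translates into a direct, if exponential, check over all coalitions and all of their joint strategy choices. The sketch below is meant only for tiny explicit games; the function names and the payoff numbers of the two example games are hypothetical.

```python
from itertools import combinations, product

def has_joint_deviation(s, coalition, strategy_sets, utility):
    """Can the coalition switch jointly so that no member loses and at
    least one member strictly gains?"""
    for alt in product(*(strategy_sets[i] for i in coalition)):
        t = list(s)
        for i, a in zip(coalition, alt):
            t[i] = a
        t = tuple(t)
        gains = [utility(i, t) - utility(i, s) for i in coalition]
        if all(g >= 0 for g in gains) and any(g > 0 for g in gains):
            return True
    return False

def is_strong_nash(s, strategy_sets, utility):
    players = range(len(s))
    return not any(has_joint_deviation(s, A, strategy_sets, utility)
                   for size in range(1, len(s) + 1)
                   for A in combinations(players, size))

# Hypothetical payoffs (higher is better).
PRISONERS = {("C", "C"): (-2, -2), ("C", "D"): (-5, -1),
             ("D", "C"): (-1, -5), ("D", "D"): (-4, -4)}
COORDINATION = {("A", "A"): (2, 2), ("A", "B"): (0, 0),
                ("B", "A"): (0, 0), ("B", "B"): (1, 1)}

pd_sets, pd_u = [["C", "D"]] * 2, (lambda i, s: PRISONERS[s][i])
co_sets, co_u = [["A", "B"]] * 2, (lambda i, s: COORDINATION[s][i])

print(is_strong_nash(("D", "D"), pd_sets, pd_u))
print(is_strong_nash(("A", "A"), co_sets, co_u))
print(is_strong_nash(("B", "B"), co_sets, co_u))
```

Mutual defection in the Prisoner's Dilemma is a Nash equilibrium but not a strong one, since the two players together would switch to cooperation; this illustrates why so few games have strong Nash equilibria.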

The concept of strong Nash is very appealing, for strong Nash equilibria have a 
very strong reinforcing property. One problem with this concept is that very few games 
have such equilibria. A nice example of a game with strong Nash equilibria is the 
game version of the stable marriage problem where boys and girls form pairs based 
on preference lists for the other sex. For a proposed matching, the natural notion of 
deviation for this game is a pair deviating (a couple who prefer each other to their 
current partners). This game will be reviewed in detail in Chapter 10. Chapter 19 
considers network formation games, and will discuss another class of games where 
coalitions of size 2 (pairs) are the natural units causing instability of a solution. 


1.7.2 Fair Division and Costsharing: Transferable Utility Games 


When utility is transferable, we can think of the game as dividing some value or sharing 
a cost between a set of players. The goal of this branch of game theory is to understand 
what is a fair way to divide value or cost between a set of participants. We assume that 
there is a set N of n participants, or players, and each subset A of players is associated 
with a cost c(A) (or value v(A)). We think of c(A) as a cost associated with serving 
the group A of players, so c(N) is the cost of serving all N players. The problem is to 
divide this cost c(N) among the n players in a “fair” way. (In case of dividing a value 
v(A), we think of v(A) as the value that the set A can generate by itself.) 

A cost-sharing for the total cost c(N) is a set of cost-shares $x_i$ for each player 
$i \in N$. We assume that cost-sharing needs to be budget balanced; i.e., we require that 
$\sum_{i \in N} x_i = c(N)$. One of the key solution concepts in this area is that of a core. We say 
that the cost-sharing is in the core if no subset of players would decrease their shares 
by breaking away from the whole set. More formally, we say that the cost-share vector 
x is in the core if $\sum_{i \in A} x_i \le c(A)$ for all sets $A \subseteq N$. A violation of this inequality precisely 
corresponds to a set A of players who can benefit by breaking away. 
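
For a handful of players, core membership can be tested by enumerating all coalitions. The sketch below is a brute-force illustration under assumed names (`core_violations`, a `cost` callable on frozensets, a `shares` dictionary); it is exponential in n and says nothing about the efficient methods the text refers to.

```python
from itertools import combinations


def core_violations(players, cost, shares, tol=1e-9):
    """List the coalitions A whose total share exceeds their stand-alone cost c(A).

    An empty result means the budget-balanced share vector is in the core.
    """
    total = sum(shares[i] for i in players)
    if abs(total - cost(frozenset(players))) > tol:
        raise ValueError("cost shares are not budget balanced")
    violations = []
    for size in range(1, len(players)):
        for A in combinations(players, size):
            if sum(shares[i] for i in A) > cost(frozenset(A)) + tol:
                violations.append(A)
    return violations


# Illustrative cost function that depends only on coalition size (an assumption).
COST_BY_SIZE = {0: 0, 1: 2, 2: 3, 3: 4}


def example_cost(A):
    return COST_BY_SIZE[len(A)]


players = [1, 2, 3]
equal_split = {1: 4 / 3, 2: 4 / 3, 3: 4 / 3}
skewed_split = {1: 2.0, 2: 1.5, 3: 0.5}
```

Here the equal split is in the core, while the skewed split is blocked by the pair (1, 2), whose combined share 3.5 exceeds its stand-alone cost 3.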

Given a notion of fair sharing, such as the core, there are a number of important 
questions one can ask. Given a cost function c, we want to know whether there is a 
cost-sharing x that is in the core. In Chapter 15 we will see that there are nice ways 
of characterizing problems that have a nonempty core. We will also be concerned with 
the complexity of finding a cost-sharing in the core, and deciding whether the core is 
nonempty. The computational complexity of determining whether the core is empty has 
been extensively studied for many fundamental games. If the core is empty or finding 
a solution in the core is an intractable problem, one can consider a relaxed version of 
this notion in which subsets of players secede only if they make substantial gains over 
being in the whole set N. We will discuss these ideas in Chapter 15. 


Here we briefly review a very different proposal for what is a “fair” way to share 
cost, the Shapley value. One advantage of the Shapley value is that it always exists. 
However, it may not be in the core, even for games that have nonempty core. 


Example 1.14 (Shapley Value) Shapley value is based on evaluating the 
marginal cost of each player. If we order the player set N as 1, ..., n and use the 
notation that $N_i = \{1, \ldots, i\}$ then the marginal cost of player i is $c(N_i) - c(N_{i-1})$. 
Of course, this marginal cost depends on the order in which the players are considered. 
The Shapley value assigns cost-share $x_i$ to player i that is the expected value of 
this marginal cost over a random order of the players. 
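
The definition translates directly into code by averaging marginal costs over all n! orders. The Python sketch below does exactly that; the function name and the example cost function (each player has a "need", and serving a set costs the largest need in it) are assumptions made for illustration, and the enumeration is practical only for very small n.

```python
from fractions import Fraction
from itertools import permutations


def shapley_cost_shares(players, cost):
    """Average each player's marginal cost c(N_i) - c(N_{i-1}) over all orders."""
    totals = {i: Fraction(0) for i in players}
    orders = list(permutations(players))
    for order in orders:
        served = frozenset()
        for i in order:
            totals[i] += cost(served | {i}) - cost(served)
            served = served | {i}
    return {i: total / len(orders) for i, total in totals.items()}


# Illustrative cost function (an assumption): the cost of a set is its largest need.
NEED = {1: 1, 2: 2, 3: 3}


def max_need_cost(A):
    return max((NEED[i] for i in A), default=0)


shares = shapley_cost_shares([1, 2, 3], max_need_cost)
```

For this example the shares come out as 1/3, 5/6, and 11/6, which sum to c(N) = 3, so the scheme is budget balanced, as the averaging argument guarantees in general.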


In Chapter 15 we will show that the Shapley value can be characterized as the unique 
cost-sharing scheme satisfying a number of different sets of axioms. 


1.8 Markets and Their Algorithmic Issues 


Some of the most crucial regulatory functions within a capitalistic economy, such as 
ensuring stability, efficiency, and fairness, are relegated to pricing mechanisms, with 
very little intervention. It is for this reason that general equilibrium theory, which 
studied equilibrium pricing, occupied a central place within mathematical economics. 

From our viewpoint, a shortcoming of this theory is that it is mostly a nonalgorithmic 
theory. With the emergence of numerous new markets on the Internet and the 
availability of massive computational power for running these markets in a centralized 
or distributed manner, there is a need for a new, inherently algorithmic theory of market 
equilibria. Such algorithms can also help understand the repercussions to existing 
prices, production, and consumption caused by technological advances, introduction 
of new goods, or changes to the tax structure. Chapters 5 and 6 summarize recent work 
along these lines. 

Central to ensuring stability of prices is that there be parity between the demand and 
supply of goods. When there is only one good in the market, such an equilibrium price 
is easy to determine — it is simply the price at which the demand and supply curves 
intersect. If the price deviates from the equilibrium price, either demand exceeds 
supply or vice versa, and the resulting market forces tend to push the price back to the 
equilibrium point. Perhaps the most celebrated result in general equilibrium theory, 
due to Arrow and Debreu (1954), shows the existence of equilibrium prices in a very 
general model of the economy with multiple goods and agents. 

It turns out that equilibria for several fundamental market models can be captured 
as optimal solutions to certain nonlinear convex programs. As a result, two algorithmic 
approaches present themselves — combinatorial algorithms for solving these convex 
programs and convex programming based approaches. These are covered in Chapters 
5 and 6, respectively. 


1.8.1 An Algorithm for a Simple Market 


In this section, we will give a gist of the models and algorithms studied using a very 
simple market model. Consider a market consisting of a set A of divisible goods and a 
set B of buyers. We are given, for each buyer i, the amount $m_i \in \mathbf{Z}^+$ of money she 
possesses, and for each good j, the amount $a_j \in \mathbf{Z}^+$ of this good. Each buyer i has 
access to only a subset, say $S_i \subseteq A$, of the goods. She is indifferent between goods in 
$S_i$, but is interested in maximizing the total amount of goods obtained. An example of 
such a situation is when identical goods are sold in different markets and each buyer has 
access to only a subset of the markets; such a model is studied in Chapter 7. Without 
loss of generality we may assume that $m_i \neq 0$, $a_j \neq 0$, for each buyer i, $S_i \neq \emptyset$, and 
for each good j, there is a buyer i such that $j \in S_i$. 

Once the prices $p_1, \ldots, p_n$ of the goods are fixed, a buyer i is only interested in the 
cheapest goods in $S_i$, say $S'_i \subseteq S_i$. Any allocation of goods from $S'_i$ that exhausts her 
money will constitute her optimal basket of goods at these prices. 

Prices are said to be market clearing or equilibrium prices if there is a way to assign 
to each buyer an optimal basket of goods so that there is no surplus or deficiency of any 
of the goods, i.e., demand equals supply. It turns out that equilibrium prices are unique 
for this market; see Chapter 5 for a proof in a more general setting. 

We will need the following notations and definitions. Define a bipartite graph G = 
(A, B, E) on vertex sets A and B as shown in Figure 1.4. The edge (j, i) connects a 
good j to a buyer i such that $j \in S_i$. Because of the assumptions made, each vertex in 
G has nonzero degree. For a subset $S \subseteq A$ of goods, let a(S) denote the total amount of goods 
in S, i.e., $a(S) = \sum_{j \in S} a_j$. For a subset $T \subseteq B$ of buyers, let $m(T) = \sum_{i \in T} m_i$ denote 
the total money possessed by buyers in T. 

The algorithm given below is iterative and always assigns uniform prices to all 
goods currently under consideration. For a set S of goods, let $\Gamma(S)$ denote the set of 
buyers who are interested in goods in S: $\Gamma(S) = \{i \in B \mid S_i \cap S \neq \emptyset\}$. This is the 
neighborhood of S in G. We say that a uniform price x is feasible if 

$$\forall S \subseteq A: \quad x \cdot a(S) \le m(\Gamma(S)),$$

i.e., the total cost of S is at most the total money possessed by buyers interested in 
goods in S. With respect to a feasible x, we will say that set $S \subseteq A$ is tight if 
$x \cdot a(S) = m(\Gamma(S))$. The importance of feasibility is established by the following lemma. 

Figure 1.4. The graph G on the left and the corresponding max-flow network N. 
(The figure shows goods 1 and 2 and buyers 1, 2, and 3.) 


Lemma 1.15 A uniform price of x on all goods is feasible if and only if all 
goods can be sold in such a way that each buyer gets goods that she is interested 
in. 


PROOF One direction is straightforward. If there is a subset $S \subseteq A$ such that 
$x \cdot a(S) > m(\Gamma(S))$ then goods in S cannot all be sold at price x since buyers 
interested in these goods simply do not have enough money. 

To prove the other direction, we will use network N (see Figure 1.4) obtained 
from the bipartite graph G for computing allocations of goods to buyers. Direct 
the edges of G from A to B and assign a capacity of infinity to all these edges. 
Introduce source vertex s and a directed edge from s to each vertex $j \in A$ with 
a capacity of $x \cdot a_j$. Introduce sink vertex t and a directed edge from each vertex 
$i \in B$ to t with a capacity of $m_i$. 

Clearly, a way of selling all goods corresponds to a feasible flow in N that 
saturates all edges going out of s. We will show that if x is feasible, then such 
a flow exists in N. By the max-flow min-cut theorem, if no such flow exists, 
then the minimum cut must have capacity smaller than $x \cdot a(A)$. Let S be the 
set of goods on the s-side of a minimum cut. Since edges (j, i) for goods $j \in S$ 
have infinite capacity, $\Gamma(S)$ must also be on the s-side of this cut. Therefore, the 
capacity of this cut is at least $x \cdot a(A - S) + m(\Gamma(S))$. If this is less than $x \cdot a(A)$ 
then $x \cdot a(S) > m(\Gamma(S))$, thereby contradicting the feasibility of x. 
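
The network construction in this proof can be tried out on a small instance. The following Python sketch builds N for a given uniform price x and tests whether a max-flow saturates every edge out of s. It is a minimal illustration, not the book's implementation: the function names, the dictionary-based input format, the plain Edmonds-Karp routine, and the three-buyer instance are all assumptions, and a finite capacity larger than the total flow stands in for "infinity".

```python
from collections import deque
from fractions import Fraction


def max_flow(cap, s, t):
    """Edmonds-Karp max-flow on a dict-of-dicts capacity map; returns the flow value."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, Fraction(0))  # reverse arcs
    value = Fraction(0)
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # breadth-first search for a shortest path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return value
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        value += push


def is_feasible_price(x, amount, money, access):
    """True if all goods can be sold at uniform price x (the criterion of Lemma 1.15)."""
    x = Fraction(x)
    total = x * sum(amount.values())
    big = total + 1  # never binding, so it plays the role of infinite capacity
    cap = {"s": {("good", j): x * a for j, a in amount.items()}}
    for j in amount:
        cap[("good", j)] = {}
    for i, goods in access.items():
        for j in goods:
            cap[("good", j)][("buyer", i)] = big
        cap[("buyer", i)] = {"t": Fraction(money[i])}
    return max_flow(cap, "s", "t") == total


# Hypothetical instance: two goods, three buyers.
amount = {"g1": 10, "g2": 10}
money = {"b1": 10, "b2": 20, "b3": 60}
access = {"b1": {"g1"}, "b2": {"g1", "g2"}, "b3": {"g2"}}
```

On this instance the price x = 3 is feasible, and the set {g1} is tight there because 3 · 10 equals the 30 units of money held by b1 and b2. Any larger price, such as 7/2, is infeasible.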


If with respect to a feasible x, a set S is tight, then on selling all goods in S, the 
money of buyers in $\Gamma(S)$ will be fully spent. Therefore, x constitutes market clearing 
prices for goods in S. The idea is to look for such a set S, allocate goods in S to $\Gamma(S)$, 
and recurse on the remaining goods and buyers. 

The algorithm starts with x = 0, which is clearly feasible, and raises x continuously, 
always maintaining its feasibility. It stops when a nonempty set goes tight. Let x* be 
the smallest value of x at which this happens and let S* be the maximal tight set (it is 
easy to see that S* must be unique). 

We need to give procedures for finding x* and S*. Observe that x* is the largest value 
of x at which $(\{s\}, A \cup B \cup \{t\})$ remains a min-cut in N. Therefore, x* can be computed 
via a binary search. After computing x*, compute the set of nodes that can reach t in 
the residual graph of a max-flow in N at x = x*. This set, say W, is the t-side of the (unique) maximal 
min-cut in N at x = x*. Then, $S^* = A - W$, the set of goods on the s-side of this cut. 

At prices x*, buyers in $\Gamma(S^*)$ will have no surplus money left and increasing x any 
more will lead to infeasibility. At this point, the algorithm fixes the prices of goods 
in S* at x*. It computes a max-flow in N for x = x*, as suggested by Lemma 1.15. 
This flow gives an allocation of goods in S* to buyers in $\Gamma(S^*)$, which fully spends all 
the money $m(\Gamma(S^*))$. The same flow also shows that x* is feasible for the problem for 
goods $A - S^*$ and buyers $B - \Gamma(S^*)$. 

In the next iteration, the algorithm removes S* and $\Gamma(S^*)$, initializes the prices of 
the goods in A — S* to x*, and raises prices until a new set goes tight. The algorithm 
continues in this manner, iteratively finding prices of sets of goods as they go tight. It 
terminates when all goods have been assigned prices. 
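
The iteration can be illustrated on tiny instances without any flow computation. The Python sketch below replaces the binary search and max-flow step by brute force: since x* is the smallest price at which some nonempty set goes tight, it equals the minimum of $m(\Gamma(S))/a(S)$ over nonempty sets S of remaining goods, and S* is the union of the minimizers. This substitution, the function name, and the sample instance are assumptions made for illustration; the enumeration takes exponential time, unlike the procedure described in the text.

```python
from fractions import Fraction
from itertools import combinations


def equilibrium_prices(amount, money, access):
    """Assign prices by repeatedly finding the maximal tight set (brute force)."""
    goods = set(amount)
    buyers = set(money)
    prices = {}
    while goods:
        x_star, s_star = None, set()
        for size in range(1, len(goods) + 1):
            for S in combinations(sorted(goods), size):
                neighborhood = {i for i in buyers if access[i] & set(S)}
                ratio = Fraction(sum(money[i] for i in neighborhood),
                                 sum(amount[j] for j in S))
                if x_star is None or ratio < x_star:
                    x_star, s_star = ratio, set(S)
                elif ratio == x_star:
                    s_star |= set(S)  # the union of tight sets is again tight
        for j in s_star:
            prices[j] = x_star
        buyers -= {i for i in buyers if access[i] & s_star}  # remove the neighborhood
        goods -= s_star
    return prices


# Hypothetical instance: two goods, three buyers.
amount = {"g1": 10, "g2": 10}
money = {"b1": 10, "b2": 20, "b3": 60}
access = {"b1": {"g1"}, "b2": {"g1", "g2"}, "b3": {"g2"}}
prices = equilibrium_prices(amount, money, access)
```

In this instance {g1} goes tight first at price 3, absorbing all the money of b1 and b2; the remaining good g2 is then priced at 6, which b3 alone pays for. Buyer b2, who has access to both goods, indeed spends only on the cheaper one.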


Lemma 1.16 The value x* is feasible for the problem restricted to goods in 
$A - S^*$ and buyers in $B - \Gamma(S^*)$. Furthermore, in the subgraph of G induced on 
$A - S^*$ and $B - \Gamma(S^*)$, all vertices have nonzero degree. 


PROOF In the max-flow computed in N for x = x*, the flow going through 
nodes in S* completely uses up the capacity of edges from $\Gamma(S^*)$ to t. Therefore, 
all the flow going through nodes in $A - S^*$ must exit via nodes in $B - \Gamma(S^*)$. Now, 
the first claim follows from Lemma 1.15. Furthermore, a good $j \in A - S^*$ must 
have nonzero degree to $B - \Gamma(S^*)$. Finally, since each buyer $i \in B - \Gamma(S^*)$ 
has nonzero degree in G and has no edges to S*, it must have nonzero degree to 
$A - S^*$. 


Theorem 1.17 The above-stated algorithm computes equilibrium prices and 
allocations in polynomial time. 


PROOF At termination, all goods are assigned prices and are therefore fully sold. 
By the second claim in Lemma 1.16, when the algorithm terminates, each buyer 
must be in the neighborhood of one of the tight sets found and therefore must be 
allocated goods in return for her money. We need to show that each buyer gets 
her optimal bundle of goods. Let S* be the first tight set found by the algorithm. 
Since S* was a maximal tight set at x*, prices must strictly rise before a new 
set goes tight in the second iteration. Therefore, prices are monotone increasing 
across iterations and all goods in A — S* are assigned higher prices than x*. Since 
each buyer $i \in \Gamma(S^*)$ is allocated goods from S* only, she was given an optimal 
bundle. Now, the claim follows by induction. 

Clearly, the algorithm will execute at most |A| iterations. The time taken for 
one iteration is dominated by the time required for computing x* and S*. Observe 
that $x^* = m(\Gamma(S^*))/a(S^*)$, i.e., its numerator and denominator are polynomial 
sized integers. Therefore binary search for finding x* will take polynomial 
time. 

Exercises 


1.1 Give a finite algorithm for finding a Nash equilibrium for a game with two players 
defined by a game matrix. Your algorithm may run in exponential time. 


1.2 Consider a two-player game given in matrix form where each player has n strategies. 
Assume that the payoffs for each player are in the range [0, 1] and are selected 
independently and uniformly at random. Show that the probability that this random 
game has a pure (deterministic) Nash equilibrium approaches 1 — 1/e as n goes to 
infinity. You may use the fact that $\lim (1 - 1/n)^n = 1/e$ as n goes to infinity. 


1.3 We have seen that finding a Nash equilibrium in a two-person zero-sum game is significantly 
easier than in general two-person games. Now consider a three-person zero-sum game, 
that is, a game in which the rewards of the three players always sum to zero. Show 
that finding a Nash equilibrium in such games is at least as hard as that in general 
two-person games. 


1.4 Consider an n person game in which each player has only two strategies. This game 
has $2^n$ possible outcomes (for the $2^n$ ways the n players can play), therefore the 
game in matrix form is exponentially large. To circumvent this, in Chapter 7 we 
will consider a special class of games called graphical games. The idea is that the 
value (or payoff) of a player can depend only on a subset of players. We will define 
a dependence graph G, whose nodes are the players, and an edge between two 
players i and j represents the fact that the payoff of player i depends on the strategy 
of player j or vice versa. Thus, if node i has k neighbors, then its payoff depends 
only on its own strategy and the strategies of its k neighbors. 

Consider a game where the players have 2 pure strategies each and assume that 
the graph G is a tree with maximum degree 3. Give a polynomial time algorithm to 
decide if such a game has a pure Nash equilibrium. (Recall that there are $2^n$ possible 
pure strategy vectors, yet your algorithm must run in time polynomial in n.) 


1.5 Consider an n player game in which each player has 2 strategies. For this problem, 
think of the strategies as “on” and “off.” For example, the strategy can be either to 
participate or not to participate in some event. Furthermore, assume that the game 
is symmetric, in that all players have the same payoff functions, and that the payoff 
for a player depends only on the strategy of the player and the number of people 
playing strategy “on.” So the game is defined by 2n values: $U_{\text{on}}(k)$ and $U_{\text{off}}(k)$, which 
denote the payoff for playing the “on” and “off” strategies, assuming that k of the 
other players chose to play “on” for k = 0, ..., n − 1. 

Give a polynomial time algorithm to find a correlated equilibrium for such a 
game. Note that the input to this problem consists of the 2n numbers above. As 
usual, polynomial means polynomial in this input length. You may use the fact that 
linear programming is solvable in polynomial time. 


1.6 Consider a 2-person game in matrix form. Assume that both players have n pure 
strategies. In a Nash equilibrium a player may be required to play a mixed strategy 
that gives nonzero probability to all (or almost all) of his pure strategies. Strategies 
that mix between so many pure options are hard to play, and also hard to understand. 
The goal of this problem is to show that one can reach an almost perfect Nash 
equilibrium by playing strategies that only choose between a few of the options. 

We will use $p^j$ to be the probability distribution for player j, so $p^j_i$ is the 
probability that player j will use his ith pure strategy. The support of a mixed strategy $p^j$ 
for player j is the set $S^j = \{i : p^j_i > 0\}$, i.e., the set of different pure strategies that 
are used with nonzero probability. We will be interested in solutions where each 
player has a strategy with small support. 

For a given $\epsilon > 0$, we will say that a set of mixed strategies $p^1, p^2$ is $\epsilon$-approximate 
Nash if for both players j = 1 or 2, and all other strategies $\bar{p}^j$ for this player, the 
expected payoff for player j using strategy $\bar{p}^j$ is at most $\epsilon M$ more than his expected 
payoff using strategy $p^j$, where M is the maximum payoff. 

Show that for any fixed $\epsilon > 0$ and any 2-player game with all nonnegative payoffs, 
there is an $\epsilon$-approximate Nash equilibrium such that both players play the following 
simple kind of mixed strategy. For each player j, the strategy selects a subset $\hat{S}_j$ of at 
most O(log n) of player j’s pure strategies, and makes player j select one of the strategies 
in $\hat{S}_j$ uniformly at random. The set $\hat{S}_j$ may be a multiset, i.e., may contain the 
same pure strategy more than once (such a strategy is more likely to be selected by the 
random choice). The constant in the O(.) notation may depend on the parameter $\epsilon$. 

Hint: Consider any mixed Nash strategy with possibly large support, and try to 
simplify the support by selecting the subsets $\hat{S}_j$ for the two players based on this 
Nash equilibrium. 

1.7 The classical Bertrand game is the following. Assume that n companies, which 
produce the same product, are competing for customers. If each company i has a 
production level of $q_i$, there will be $q = \sum_i q_i$ units of the product on the market. 
Now, demand for this product depends on the price and if q units are on the 
market, price will settle so that all q units are sold. Assume that we are given a 
“demand-price curve” p(d), which gives the price at which all d units can be sold. 
Assume that p(d) is a monotone decreasing, differentiable function of d. With this 
definition, the income of the firm i will be $q_i \, p(q)$. Assume that production is very 
cheap and each firm will produce to maximize its income. 

(a) Show that the total income for a monopolistic firm can be arbitrarily higher 
than the total income of many different firms sharing the same market. Hint: this is 
true for almost all price curves; you may want to use, e.g., p(d) = 1 — d. 

(b) Assume that p(d) is twice differentiable, monotone decreasing, and $p''(d) \le 0$. 
Show that the monopolistic income is at most n times the total income of the n 
competing companies. 

1.8 Let V denote a set of n agents, labeled 1, 2, ..., n. Let 0 denote the root node and 
for any subset $S \subseteq V$, let $S^+$ denote the set $S \cup \{0\}$. Let $G = (V^+, E)$ be a complete, 
undirected graph with edge costs $c : E \to \mathbf{Z}^+$ which satisfy the triangle inequality. 
For a subset $S \subseteq V$, let c(S) denote the cost of a minimum spanning tree in the 
subgraph of G induced on $S^+$. The spanning tree game asks for a budget balanced 
cost-sharing method for minimum spanning tree that lies in the core. 

Consider the following cost-sharing method for sharing the cost of building a 
minimum spanning tree in G among the n agents. Find any minimum spanning 
tree, say T, and root it at vertex 0. Define the cost of agent i to be the cost of the 
first edge on the unique path from i to 0 in T. Clearly, this cost-sharing method 
is budget balanced; i.e., the total cost retrieved from the n agents is precisely the 
cost of a minimum spanning tree in G. Show that this cost-sharing method is in the 
core, i.e., for any subset $S \subseteq V$, the total cost charged to agents in S is at most the 
cost they would incur if they were to directly connect to the root, i.e., c(S). 


CHAPTER 2 


The Complexity of Finding 
Nash Equilibria 


Christos H. Papadimitriou 


Abstract 


Computing a NASH equilibrium, given a game in normal form, is a fundamental problem for 
Algorithmic Game Theory. The problem is essentially combinatorial, and in the case of two players it 
can be solved by a pivoting technique called the Lemke–Howson algorithm, which however is 
exponential in the worst case. We outline the recent proof that finding a NASH equilibrium is complete 
for the complexity class PPAD, even in the case of two players; this is evidence that the problem is 
intractable. We also introduce several variants of succinctly representable games, a genre important 
in terms of both applications and computational considerations, and discuss algorithms for correlated 
equilibria, a more relaxed equilibrium concept. 


2.1 Introduction 


NASH’s theorem, stating that every finite game has a mixed NASH equilibrium (Nash, 
1951), is a very reassuring fact: Any game can, in principle, reach a quiescent state, 
one in which no player has an incentive to change his or her behavior. One question 
arises immediately: Can this state be reached in practice? Is there an efficient algorithm 
for finding the equilibrium that is guaranteed to exist? This is the question explored in 
this chapter. 

But why should we be interested in the issue of computational complexity in con- 
nection to NASH equilibria? After all, a NASH equilibrium is above all a conceptual 
tool, a prediction about rational strategic behavior by agents in situations of conflict — 
a context that is completely devoid of computation. 

We believe that this matter of computational complexity is one of central importance 
here, and indeed that the algorithmic point of view has much to contribute to the debate 
of economists about solution concepts. The reason is simple: If an equilibrium concept 
is not efficiently computable, much of its credibility as a prediction of the behavior 
of rational agents is lost — after all, there is no clear reason why a group of agents 
cannot be simulated by a machine. Efficient computability is an important modeling 
prerequisite for solution concepts. In the words of Kamal Jain, “If your laptop cannot 
find it, neither can the market.”¹ 


2.1.1 Best Responses and Supports 


Let us thus define NASH to be the following computational problem: Given a game 
in strategic form, find a NASH equilibrium. Since NASH calls for the computation 
of a real-valued distribution for each player, it seems prima facie to be a quest in 
continuous mathematics. However, a little thought reveals that the task is essentially 
combinatorial. 

Recall that a mixed strategy profile is a NASH equilibrium if the mixed strategy 
of each player is a best response to the mixed strategies of the rest; that is, it attains 
the maximum possible utility among all possible mixed strategies of this player. The 
following observation is useful here (recall that the support of a mixed strategy is the 
set of all pure strategies that have nonzero probability in it). 


Theorem 2.1 A mixed strategy is a best response if and only if all pure strategies 
in its support are best responses. 


To see why, assume for the sake of contradiction that a best response mixed strategy 
contains in its support a pure strategy that is not itself a best response. Then the utility of 
the player would be improved by decreasing the probability of the worst such strategy 
(increasing proportionally the remaining nonzero probabilities to fill the gap); this 
contradicts the assumption that the mixed strategy was a best response. Conversely, if 
all strategies in all supports are best responses, then the strategy profile combination 
must be a NASH equilibrium. 

This simple fact reveals the subtle nature of a mixed NASH equilibrium: Players 
combine pure best response strategies (instead of using, for the same utility, a single 
pure best response) in order to create for other players a range of best responses that 
will sustain the equilibrium! 

Example 2.2 Consider the symmetric game with two players captured by the 

matrix 


A game with two players can be represented by two matrices (A, B) (hence the 
term bimatrix game often used to describe such games), where the rows of A are 
the strategies of Player 1 and the columns of A are the strategies of Player 2, 
while the entries are the utilities of Player 1; the opposite holds for matrix B. A 
bimatrix game is called symmetric if $B = A^T$; i.e., the two players have the same 
set of strategies, and their utilities remain the same if their roles are reversed. 

¹ One may object to this aphorism on the basis that in markets agents work in parallel, and are therefore more 
powerful than ordinary algorithms; however, a little thought reveals that parallelism cannot be the cure for 
exponential worst case. 

In the above symmetric game, consider the equilibrium in which both players 
play the mixed strategy (0, 1/3, 2/3). This is a symmetric NASH equilibrium, 
because both players play the same mixed strategy. (A variant of NASH’s proof 
establishes that every symmetric game, with any number of players, has a symmetric 
equilibrium; it may also have nonsymmetric ones.) We can check whether 
it is indeed an equilibrium, by calculating the utility of each strategy, assuming 
the opponent plays (0, 1/3, 2/3): The utilities are 1 for the first strategy, and 2 
for the other two. Thus, every strategy in the support (i.e., either of strategies 2 
and 3) is a best response, and the mixed strategy is indeed a NASH equilibrium. 
Note that, from Player 1’s point of view, playing just strategy 2, or just strategy 3, 
or any mixture of the two, is as beneficial as the equilibrium mixed strategy 
(0, 1/3, 2/3). The only advantage of following the precise mix suggested by the 
equilibrium is that it motivates the other player to do the same. 
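
The check described in this example is mechanical, as the following Python sketch suggests. The payoff matrix used here is an assumption: it is a stand-in chosen so that the utilities against (0, 1/3, 2/3) come out as 1, 2, and 2, matching the numbers quoted above, and the function names are likewise hypothetical.

```python
from fractions import Fraction


def pure_strategy_utilities(A, q):
    """Expected utility of each pure strategy (row of A) against the opponent mix q."""
    return [sum(a * p for a, p in zip(row, q)) for row in A]


def is_symmetric_equilibrium(A, q):
    """Support criterion: every pure strategy played with positive probability
    must be a best response to q (symmetric game, both players use q)."""
    utilities = pure_strategy_utilities(A, q)
    best = max(utilities)
    return all(utilities[i] == best for i, p in enumerate(q) if p > 0)


# Assumed payoff matrix for the row player; the column player's matrix is its transpose.
A = [[0, 3, 0],
     [0, 0, 3],
     [2, 2, 2]]
q = [Fraction(0), Fraction(1, 3), Fraction(2, 3)]
```

With this matrix, `pure_strategy_utilities(A, q)` gives 1, 2, 2, so both strategies in the support are best responses and the test succeeds. Playing strategy 3 alone against itself fails the test, because strategy 2 would then earn 3 instead of 2.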

Incidentally, in our discussion of NASH equilibria in this chapter, we shall often 
use the simpler two-player case to illustrate the ideas. Unfortunately, the main 
result of this section says that two-player games are not, in any significant sense, 
easier than the general problem. 


It also follows from these considerations that finding a mixed NASH equilibrium 
means finding the right supports: Once one support for each player has been identified, 
the precise mixed strategies can be computed by solving a system of algebraic equations 
(in the case of two players, linear equations): For each player i we have a number of 
variables equal to the size of the support, call it $k_i$, one equation stating that these 
variables add to 1, and $k_i - 1$ others stating that the $k_i$ expected utilities are equal. 
Solving this system of $\sum_i k_i$ equations in $\sum_i k_i$ unknowns yields $k_i$ numbers for 
each player. If these numbers are real and nonnegative, and the utility expectation is 
maximized at the support, then we have discovered a mixed NASH equilibrium. 
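
For two players the system is linear and can be solved exactly. The Python sketch below handles the symmetric case with one common support. It is an illustration under stated assumptions: the function names are hypothetical, the payoff matrix is an assumed example, and the common utility is added as one extra unknown, which is equivalent to writing $k_i - 1$ equalities between expected utilities.

```python
from fractions import Fraction


def solve_linear(M, b):
    """Solve M z = b exactly by Gauss-Jordan elimination; None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(c)] for row, c in zip(M, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def symmetric_equilibrium_on_support(A, support):
    """Return the mixed strategy with the given support if it is a symmetric
    equilibrium of the game with row-player matrix A, else None."""
    k = len(support)
    # Unknowns: k probabilities and the common utility u.
    M = [[A[i][j] for j in support] + [-1] for i in support]  # sum_j A[i][j] y_j - u = 0
    M.append([1] * k + [0])                                   # probabilities add to 1
    z = solve_linear(M, [0] * k + [1])
    if z is None or any(p < 0 for p in z[:k]):
        return None
    mix = dict(zip(support, z[:k]))
    q = [mix.get(j, Fraction(0)) for j in range(len(A))]
    utilities = [sum(a * p for a, p in zip(row, q)) for row in A]
    # The utility expectation must be maximized at the support.
    return q if max(utilities) == z[k] else None


# Assumed example matrix (rows and columns are indexed from 0).
A = [[0, 3, 0],
     [0, 0, 3],
     [2, 2, 2]]
```

For the support consisting of the second and third strategies, the solver returns (0, 1/3, 2/3) with common utility 2. The other supports tried in the tests are rejected because some strategy outside the support does strictly better.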

In fact, if in the two-player case the utilities are integers (as it makes sense to assume 
in the context of computation), then the probabilities in the mixed NASH equilibrium 
will necessarily be rational numbers, since they constitute the solution of a system of 
linear equations with integer coefficients. This is not true in general: NASH’s original 
paper (1951) includes a beautiful example of a three-player poker game whose only 
NASH equilibrium involves irrational numbers. 

The bottom line is that finding a NASH equilibrium is a combinatorial problem: It 
entails identifying an appropriate support for each player. Indeed, most algorithms 
proposed over the past half century for finding NASH equilibria are combinatorial in 
nature, and work by seeking supports. Unfortunately, none of them is known to be 
efficient, that is, to always succeed after only a polynomial number of steps. 


2.2 Is the NASH Equilibrium Problem NP-Complete? 


Computer scientists have developed over the years notions of complexity, chief among 
them NP-completeness (Garey and Johnson, 1979), to characterize computational problems 
which, just like NASH and SATISFIABILITY,² seem to resist efficient solution. Should 
we then try to apply this theory and prove that NASH is NP-complete? 


² Recall that SATISFIABILITY is the problem that asks, given a Boolean formula in conjunctive normal form, to 
find a satisfying truth assignment. 

It turns out that NASH is a very different kind of intractable problem, one for which 
NP-completeness is not an appropriate concept of complexity. The basic reason is 
that every game is guaranteed to have a NASH equilibrium. In contrast, in a typical 
NP-complete problem such as SATISFIABILITY, the sought solution may or may not 
exist. NP-complete problems owe much of their difficulty, and their susceptibility to 
NP-completeness reductions, to precisely this dichotomy.³ For, suppose that NASH is 
NP-complete, and there is a reduction from SATISFIABILITY to NASH. This would entail 
an efficiently computable function f mapping Boolean formulae to games, and such 
that, for every formula $\phi$, $\phi$ is satisfiable if and only if any NASH equilibrium of $f(\phi)$ 
satisfies some easy-to-check property $\Pi$. But now, given any unsatisfiable formula $\phi$, 
we could guess a NASH equilibrium of $f(\phi)$, and check that it does not satisfy $\Pi$: This 
implies NP = coNP! 

Problems such as NASH for which a solution is guaranteed to exist require much 
more specialized and subtle complexity analysis — and the end diagnosis is necessar- 
ily less severe than NP-completeness (see Beame et al., 1998; Johnson et al., 1988; 
Papadimitriou, 1994 for more on this subject). 


2.2.1 NASH vs Brouwer 


In contemplating the complexity of NASH, a natural first reaction is to look into NASH’s 
proof (1951) and see precisely how existence is established — with an eye towards 
making this existence proof “constructive.” Unfortunately this does not get us very 
far, because NASH’s proof relies on Brouwer’s fixpoint theorem, stating that every 
continuous function f from the n-dimensional unit ball to itself has a fixpoint: a point 
x such that f(x) = x. NASH’s proof is a clever reduction of the existence of a mixed 
equilibrium to the existence of such a fixpoint. Unfortunately, Brouwer’s theorem is 
well-known for its nonconstructive nature, and finding a Brouwer fixpoint is known to 
be a hard problem (Hirsch et al., 1989; Papadimitriou, 1994) — again, in the specialized 
sense alluded to above, since a solution is guaranteed to exist here also. 

Natural next question: Is there a reduction in the opposite direction, one establishing 
that NASH is precisely as hard as the known difficult problem of finding a Brouwer 
fixpoint? The answer is “yes,” and this is in fact a useful alternative way of understanding 
the main result explained in this chapter.⁴ 


2.2.2 NP-Completeness of Generalizations 


As we have discussed, what makes NP-completeness inappropriate for NASH is the 
fact that NASH equilibria always exist. If the computational problem NASH is twisted 
in any one of several simple ways that deprive it of its existence guarantee, 
NP-completeness comes into play almost immediately. 

³ But how about the traveling salesman problem? Does it not always have a solution? It does, but this solution 
(the optimum tour) is hard to verify, and so the TSP is not an appropriate comparison here. To be brought into 
the realm of NP-completeness, optimization problems such as the TSP must be first transformed into decision 
problems of the form “given a TSP instance and a bound B, does a tour of length B or smaller exist?” This 
problem is much closer to SATISFIABILITY. 

⁴ This may seem puzzling, as it seems to suggest that Brouwer’s theorem is also of a combinatorial nature. As 
we shall see, in a certain sense indeed it is. 


Theorem 2.3 (Gilboa and Zemel, 1989) The following are NP-complete problems,
even for symmetric games: Given a two-player game in strategic form, does
it have

• at least two Nash equilibria?
• a Nash equilibrium in which player 1 has utility at least a given amount?
• a Nash equilibrium in which the two players have total utility at least a given
  amount?
• a Nash equilibrium with support of size greater than a given number?
• a Nash equilibrium whose support contains strategy s?
• a Nash equilibrium whose support does not contain strategy s?
• etc., etc.


A simple proof, due to Conitzer and Sandholm (2003), goes roughly as follows. The
reduction is from SATISFIABILITY. It is not hard to construct a symmetric game whose
strategies are all literals (variables and their negations) and whose Nash equilibria are
all truth assignments. In other words, if we choose, for each of the n variables, either the
variable itself or its negation, and play it with probability 1/n, then we get a symmetric
Nash equilibrium, and all Nash equilibria of the game are of this sort. It is also easy to
add to this game a new pure Nash equilibrium (d, d), with lower utility, where d (for
“default”) is a new strategy. We then add new strategies, one for each clause, such
that the strategy for clause C is attractive, when a particular truth assignment is played
by the opponent, only if all three literals of C are contradicted by the truth assignment.
Once a clause becomes attractive, it destroys the assignment equilibrium (via other
strategies not detailed here) and makes it drift to (d, d). It is then easy to establish that
the Nash equilibria of the resulting game are precisely (d, d) plus all satisfying truth
assignments. All the results enumerated in the statement of the theorem, and more,
follow very easily.


2.3 The Lemke-Howson Algorithm


We now sketch the Lemke-Howson algorithm, the best known among the combinatorial
algorithms for finding a Nash equilibrium (this algorithm is explained in much more
detail in the next chapter). It works in the case of two-player games, by exploiting
the elegant combinatorial structure of supports. It constitutes an alternative proof of
Nash's theorem, and brings out in a rather striking way the complexity issues involved
in solving NASH. Its presentation is much simpler in the case of symmetric games. We
therefore start by proving a basic complexity result for games: looking at symmetric
games is no loss of generality.


2.3.1 Reduction to Symmetric Games


Define SYMMETRIC NASH to be the following problem: Given a symmetric game, find
a symmetric Nash equilibrium. As noted above, Nash proved in his original paper
that such an equilibrium always exists. Here we establish the following fact, which was
actually first pointed out before Nash's paper, in Gale et al. (1950), with essentially the
same proof, for the case of two-player zero-sum games:


Theorem 2.4 There is a polynomial reduction from NASH to SYMMETRIC NASH. 


Thus the symmetric case of NASH is as hard as the general one. 

We shall describe the reduction for the two-player case, the proof for any fixed
number of players being a straightforward generalization. Suppose that we are given
a two-player game described by matrices A and B; without loss of generality, assume
that all entries of these matrices are positive (adding the same number to all entries of
A or B changes nothing). Consider now the symmetric game consisting of this matrix:

$$C = \begin{pmatrix} 0 & A \\ B^{T} & 0 \end{pmatrix}$$

and let (x, y) be a symmetric equilibrium of this game (by x we denote
the first m components of the vector, where m is the number of rows of A, and by y
the rest). Because all entries of A and B are positive, neither x nor y can be the zero
vector, so each can be rescaled to sum to 1. It is easy to see that, for (x, y) to be a best
response to itself, y must be a best response to x, and x must be a best response to y.
Hence, x and y (so rescaled) constitute a Nash equilibrium of the original game,
completing the proof.
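
The construction is mechanical enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: payoff matrices are given as lists of rows, and the helper names `symmetrize` and `split_equilibrium` are hypothetical, not taken from the text.

```python
from fractions import Fraction

def symmetrize(A, B):
    """Build C = [[0, A], [B^T, 0]] from m x n payoff matrices A and B."""
    m, n = len(A), len(A[0])
    top = [[0] * m + list(A[i]) for i in range(m)]
    bottom = [[B[i][j] for i in range(m)] + [0] * n for j in range(n)]
    return top + bottom

def split_equilibrium(z, m):
    """Split a symmetric equilibrium z of C into rescaled strategies (x, y)."""
    x, y = z[:m], z[m:]
    sx, sy = sum(x), sum(y)  # both nonzero when A and B are positive
    return [v / sx for v in x], [v / sy for v in y]

# Matching pennies, shifted so that all entries are positive.
A = [[2, 1], [1, 2]]
B = [[1, 2], [2, 1]]
C = symmetrize(A, B)
z = [Fraction(1, 4)] * 4          # a symmetric equilibrium of C
x, y = split_equilibrium(z, m=2)  # uniform strategies for both players
```

In this small case every row of C earns the same payoff against z, so z is a best response to itself, and the recovered pair (x, y) is the familiar uniform equilibrium of matching pennies.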

Incidentally, it is not known how hard it is to find any Nash equilibrium in a
symmetric game (it could be easier than NASH), or to find a nonsymmetric equilibrium
in a symmetric game (it could be easier or harder than NASH).


2.3.2 Pivoting on Supports 


So, let us concentrate on finding a Nash equilibrium in a symmetric two-player game
with n × n utility matrix A, assumed with no loss of generality to have nonnegative
entries and in addition no column that is totally zero. Consider the convex polytope
P defined by the 2n inequalities $Az \le 1$, $z \ge 0$ (it turns out that these inequalities
are important in identifying mixed Nash equilibria, because, intuitively, when an
inequality $A_i z \le 1$ is tight, the corresponding strategy is a best response). It is
a nonempty, bounded polytope (since z = 0 is a solution, and all coefficients of A are
nonnegative while no column is zero). Let us assume for simplicity that the polytope P
is also nondegenerate, that is, every vertex lies on precisely n constraints (every linear
program can be made nondegenerate by a slight perturbation of its coefficients, so this
is little loss of generality). We say that a strategy i is represented at a vertex z if at that
vertex either $z_i = 0$ or $A_i z = 1$ or both; that is, if at least one of the two inequalities
of the polytope associated with strategy i is tight at z.

Suppose that at a vertex z all strategies are represented. This of course could happen
if z is the all-zero vertex, but suppose it is not. Then for all strategies i with $z_i > 0$ it
must be the case that $A_i z = 1$. Define now a vector x as follows:

$$x_i = \frac{z_i}{\sum_{j=1}^{n} z_j}$$

Since we assume $z \neq 0$, the $x_i$'s are well defined, and they are nonnegative numbers
adding to 1, thus constituting a mixed strategy. We claim that x is a symmetric Nash
equilibrium. In proof, just notice that x satisfies the necessary and sufficient condition
of a Nash equilibrium (recall Theorem): Every strategy in its support is a best response.
Indeed, every strategy i in the support has $A_i z = 1$, the largest value allowed by
$Az \le 1$, and rescaling z to x does not change which strategies attain the maximum.

Figure 2.1. The Lemke-Howson algorithm can be thought of as following a directed path in a
graph.

Let us apply this to the symmetric game of Example 2.2, with utility matrix


$$A = \begin{pmatrix} 0 & 3 & 0 \\ 0 & 0 & 3 \\ 2 & 2 & 2 \end{pmatrix}$$


The polytope P is shown in Figure 2.1; it is nondegenerate because every vertex
lies on three planes, and has three adjacent vertices. The vertices are labeled by the
strategies that are represented there (ignore the exponents * for a moment). The only
vertices where all strategies are represented are the vertex z = (0, 0, 0) and the vertex
z = (0, 1/6, 1/3); notice that the latter vertex corresponds to the Nash equilibrium
x = (0, 1/3, 2/3).
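
The claims about this example can be checked with exact arithmetic. The sketch below assumes the matrix of Example 2.2 has rows (0, 3, 0), (0, 0, 3), (2, 2, 2); the function names `represented`, `to_mixed`, and `is_symmetric_nash` are illustrative choices, not notation from the text.

```python
from fractions import Fraction as F

A = [[0, 3, 0],
     [0, 0, 3],
     [2, 2, 2]]

def represented(A, z):
    """Strategies i (numbered from 1) with z_i = 0 or A_i z = 1."""
    n = len(A)
    Az = [sum(A[i][j] * z[j] for j in range(n)) for i in range(n)]
    return {i + 1 for i in range(n) if z[i] == 0 or Az[i] == 1}

def to_mixed(z):
    """Rescale a nonzero point z of the polytope to a mixed strategy."""
    total = sum(z)
    return [v / total for v in z]

def is_symmetric_nash(A, x):
    """True if every strategy in the support of x is a best response to x."""
    n = len(A)
    payoff = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    best = max(payoff)
    return all(payoff[i] == best for i in range(n) if x[i] > 0)

z = [F(0), F(1, 6), F(1, 3)]
x = to_mixed(z)
```

At z = (0, 1/6, 1/3) all three strategies are represented, and the rescaled x passes the support test. At a vertex such as (0, 0, 1/3) only strategies 1 and 2 are represented, so it does not yield an equilibrium.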

So, any vertex of P (other than (0, 0, 0)) at which all strategies are represented is a 
NASH equilibrium. But how do we know that such a vertex exists in general? After all, 
not all choices of n tight constraints result in vertices of a polytope. We shall develop 
a pivoting method for looking for such a vertex. 

Fix a strategy, say strategy n, and consider the set V of all vertices of P at which all
strategies are represented except possibly for strategy n. This set of vertices is nonempty,
because it contains vertex (0, 0, 0), so let us start there a path ($v_0 = 0, v_1, v_2, \ldots$) of
vertices in the set V. Since we assume that P is nondegenerate, there are n vertices
adjacent to every vertex, and each is obtainable by relaxing one of the tight inequalities
at the vertex and making some other inequality tight. So consider the n vertices adjacent
to $v_0 = (0, 0, 0)$. In one of these vertices, $z_n$ is nonzero and all other variables are zero,
so this new vertex is also in V; call it $v_1$.


Figure 2.2. The path cannot cross itself. 


At $v_1$ all strategies are represented except for strategy n, and in fact one strategy
i < n is “represented twice,” in that we have both $z_i = 0$ and $A_i z = 1$. (We represent
this by i*.) By relaxing either of these two inequalities we can obtain two new vertices
in V adjacent to $v_1$. One of them is $v_0$, the vertex we came from, and the other is bound
to be some new vertex $v_2 \in V$.

If at $v_2$ all strategies are represented, then it is a Nash equilibrium and we are done.
Otherwise, there is a strategy j that is represented twice at $v_2$, and there are two vertices
in V that are adjacent to $v_2$ and correspond to these two inequalities. One of these two
vertices is $v_1$ and the other is our new vertex $v_3$, and so on. The path for the example
of Figure 2.1, where strategy n = 3 is the one that may not be represented, is shown as
a sequence of bold arrows.

How can this path end? No vertex $v_i$ can be repeated, because repeating $v_i$ (see
Figure 2.2) would mean that there are three vertices adjacent to $v_i$ that are obtainable
by relaxing a constraint associated with its doubly represented strategy, and this is
impossible (it is also easy to see that it cannot return to 0). And it cannot go on forever,
since P is a finite polytope. The only place where the process can stop is at a vertex in
V, other than 0 (a moment's thought tells us it has to be different from 0), that has no
doubly represented strategy; that is to say, at a symmetric Nash equilibrium!

This completes our description of the Lemke-Howson algorithm, as well as our
proof of Nash's theorem for two-player, nondegenerate games.
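
The pivoting walk just described can be sketched in Python for small symmetric games. This is a deliberately naive illustration, not an efficient implementation: instead of a simplex ratio test, each pivot tries every candidate constraint and solves the resulting linear system exactly. It assumes a nondegenerate polytope, and the names `solve`, `vertex`, and `lemke_howson` are hypothetical. Constraint k < n stands for $z_k \ge 0$ and constraint n + i for $A_i z \le 1$, so the strategy associated with constraint k is k mod n (numbered from 0).

```python
from fractions import Fraction

def solve(rows, rhs):
    """Solve a square linear system exactly; return None if it is singular."""
    n = len(rows)
    m = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        piv = next((r for r in range(col, n) if m[r][col] != 0), None)
        if piv is None:
            return None
        m[col], m[piv] = m[piv], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]

def vertex(A, tight):
    """Point where the constraints in `tight` hold with equality, if feasible."""
    n = len(A)
    rows, rhs = [], []
    for k in sorted(tight):
        if k < n:                       # z_k = 0
            rows.append([int(j == k) for j in range(n)])
            rhs.append(0)
        else:                           # A_{k-n} z = 1
            rows.append(A[k - n])
            rhs.append(1)
    z = solve(rows, rhs)
    if z is None or any(v < 0 for v in z):
        return None
    if any(sum(a * v for a, v in zip(row, z)) > 1 for row in A):
        return None
    return z

def lemke_howson(A, missing=None):
    """Follow the path from the all-zero vertex; `missing` may be unrepresented."""
    n = len(A)
    missing = n - 1 if missing is None else missing
    tight = set(range(n))               # the all-zero vertex
    leave = missing                     # first relax z_missing >= 0
    while True:
        here = vertex(A, tight)
        base = tight - {leave}
        for c in range(2 * n):          # brute-force search for the entering constraint
            if c in tight:
                continue
            z = vertex(A, base | {c})
            if z is not None and z != here:
                break
        else:
            raise ValueError("no adjacent vertex found (degenerate input?)")
        tight = base | {c}
        if c % n == missing:            # all strategies are now represented
            total = sum(z)
            return [v / total for v in z]
        leave = c - n if c >= n else c + n  # relax the other copy of the doubled strategy

A = [[0, 3, 0], [0, 0, 3], [2, 2, 2]]
x = lemke_howson(A)
```

On the matrix of Example 2.2, with strategy 3 as the one allowed to be unrepresented, the walk visits (0, 0, 1/3) and then stops at (0, 1/6, 1/3), which rescales to x = (0, 1/3, 2/3).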


2.4 The Class PPAD 


Let us dissect the existence proof in the previous section. It works by creating a graph. 
The set of vertices of this graph, V, is a finite set of combinatorial objects (vertices of P, 
or sets of inequalities, where all strategies are represented, with the possible exception 
of strategy n). This graph has a very simple “path-like” structure: All vertices have 
either one or two edges incident upon them, because every vertex $v \in V$ has either
one or two adjacent vertices (depending on whether or not strategy n is represented in 
v). The overall graph may be richer than a path — it will be, in general, a set of paths 
and cycles (see Figure 2.3). The important point is that there is definitely at least one 
known endpoint of a path: the all-zero vertex. We must conclude that there is another 
endpoint, and this endpoint is necessarily a NASH equilibrium of the game. 

We must now mention a subtle point: the paths are directed. Looking at a vertex in
V, we can assign a direction to its incident edge(s), at most one coming in and at most
one going out, and do this in a way that is consistent from one vertex to another. In
our three-dimensional example of Figure 2.1 the rule for assigning directions is simple:
Going in the direction of the arrow, we should have a face all vertices of which are
labeled 3 on our right, and a face all vertices of which are labeled 1 on our left. In games
with more strategies, and thus a polytope of a higher dimension, there is a similar but
more complicated (and more algebraic) “orientation rule.” So, the graph in the proof
of Nash's theorem is a directed graph with all outdegrees and indegrees at most one.

Figure 2.3. A typical problem in PPAD.

What we mean to say here is that the existence proof of Nash's theorem (for the
two-player symmetric, nondegenerate case, even though something similar holds for the
general case as well) has the following abstract structure: A directed graph is defined on
a set of nodes that are easily recognizable combinatorial objects (in our case, vertices
of the polytope where all strategies, with the possible exception of strategy n, are
represented). Each one of these vertices has indegree and outdegree at most one; therefore, the
graph is a set of paths and cycles (see Figure 2.3). By necessity there is one vertex with
no incoming edges and one outgoing edge, called a standard source (in the case of
two-player NASH, the all-zero vertex). We must conclude that there must be a sink: a Nash
equilibrium. In fact, not just a sink: notice that a source other than the standard (all-zero)
one is also a Nash equilibrium, since all strategies are represented there as well.
Another important point is that there is an efficient way, given a vertex in the graph, to find
its two adjacent vertices (or decide that there is only one). This can be done by simplex
pivoting on the doubly represented variable (or on variable n, if it is represented).

Any such proof suggests a simple algorithm for finding a solution: start from the
standard source, and follow the path until you find a sink (in the case of two-player
NASH this is called the Lemke-Howson algorithm). Unfortunately, this is not an efficient
algorithm because the number of vertices in the graph is exponentially large. Actually,
in the case of two-player NASH there are examples of games in which such paths are
exponentially long (Savani and von Stengel, 2004).

It turns out that, besides NASH, there is a host of other computational problems with
guaranteed existence of solutions, for which existence follows from precisely this type
of argument:


• A directed graph is defined on a finite but exponentially large set of vertices.

• Each vertex has indegree and outdegree at most one.

• Given a string, it is a computationally easy problem to (a) tell if it is indeed a vertex of
the graph, and if so to (b) find its neighbors (one or two of them), and to (c) tell which
one is the predecessor and/or which one is the successor (i.e., identify the direction of
each edge).

• There is one known source (vertex with no incoming edges) called the “standard source.”

• Any sink of the graph (a vertex with no outgoing edges), or any source other than the
standard one, is a solution of the problem.
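
The properties listed above can be made concrete with a toy sketch. Here the implicit graph is given by two Python callables, `successor` and `predecessor`, standing in for the "computationally easy" neighbor computations; these names, and the toy path on k-bit integers, are assumptions made purely for illustration.

```python
def follow_path(source, successor, predecessor):
    """Walk from the standard source until a sink is reached.

    Returns the sink and the number of edges traversed.
    """
    v, steps = source, 0
    while True:
        w = successor(v)
        # An edge v -> w exists only if both endpoints agree on it.
        if w is None or predecessor(w) != v:
            return v, steps
        v, steps = w, steps + 1

def toy_graph(k):
    """A single directed path 0 -> 1 -> ... -> 2^k - 1 on k-bit integers."""
    last = 2 ** k - 1
    successor = lambda v: v + 1 if v < last else None
    predecessor = lambda v: v - 1 if v > 0 else None
    return successor, predecessor

succ, pred = toy_graph(10)
sink, steps = follow_path(0, succ, pred)
```

Each vertex here is described by only k = 10 bits, yet the walk takes 2^10 - 1 steps. This is the rote traversal that a clever algorithm for a PPAD problem would somehow have to avoid.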


One problem whose existence proof has this form is finding an approximate Brouwer 
fixpoint of a function. We omit the precise definition and representation details here; 
a stylized version of this problem is defined in Section 2.6. Another is the following
problem called HAM SANDWICH: Given n sets of 2n points each in n dimensions, find
a hyperplane which, for each of the n sets, leaves n points on each side. There are
many other such problems (see Papadimitriou, 1994). For none of these problems do 
we know a polynomial algorithm for finding a solution. 

All these problems comprise the complexity class called PPAD.⁵ In other words,
PPAD is the class of all problems whose solution space can be set up as the set of
all sinks and all nonstandard sources in a directed graph with the properties displayed
above.

Solving a problem in PPAD is to telescope the long path and arrive at a sink (or 
a nonstandard source), fast and without rote traversal — just as solving a problem in 
NP means narrowing down to a solution among the exponentially many candidates 
without exhaustive search. We do not know whether either of these feats is possi- 
ble in general. But we do know that achieving the latter would imply managing the 
former too. That is, P = NP implies PPAD = P (proof: PPAD is essentially a sub- 
set of NP, since a solution, such as a NASH equilibrium, can be certified quickly if 
found). 

In the case of NP, we have a useful notion of difficulty, NP-completeness, that
helps characterize the complexity of difficult problems in NP, even in the absence of
a proof that P ≠ NP. A similar manoeuvre is possible and useful in the case of PPAD
as well. We can advance our understanding of the complexity of a problem such as
NASH by proving it PPAD-complete, meaning that all other problems in PPAD reduce
to it. Such a result implies that we could solve the particular problem efficiently if
and only if all problems in PPAD (many of which, like BROUWER, are well-known
hard nuts that have resisted decades of efforts at an efficient solution) can be thus
solved.

Indeed, the main result explained in the balance of this chapter is a proof that NASH
is PPAD-complete.


5 The name, introduced in Papadimitriou (1994), stands for “polynomial parity argument (directed case).” See
that paper, as well as Beame et al. (1998) and Daskalakis et al. (2006), for a more formal definition.


2.4.1 Are PPAD-Complete Problems Hard?


But why do we think that PPAD-complete problems are indeed hard?
PPAD-completeness is weaker evidence of intractability than NP-completeness: it could
very well be that PPAD = P ≠ NP. Yet it is a rather compelling argument for
intractability. If a PPAD-complete problem could be solved in polynomial time, then all
problems in PPAD (finding Brouwer and Borsuk-Ulam fixpoints, cutting ham
sandwiches, finding Arrow-Debreu equilibria in markets, etc., many of which have resisted
decades of scrutiny; see Papadimitriou (1994) for a more complete list) would also
be solved. It would mean that any local combinatorial description of a deterministic
simplex pivoting rule would lead to a novel polynomial algorithm for linear
programming. Besides, since it is known (Hirsch et al., 1989) that any algorithm for
finding Brouwer fixpoints that treats the function as a black box must be exponential,
PPAD = P would mean that there is a way to find Brouwer fixpoints by delving into
the detailed properties of the function, a possibility that seems quite
counterintuitive. Also, an efficient algorithm for a PPAD-complete problem would have to defeat
the oracles constructed in Beame et al. (1998), computational universes in which
PPAD ≠ P, and so it would have to be extremely sophisticated in a very specific
sense.

In mathematics we must accept as a possibility anything whose negation remains 
unproved. PPAD could very well be equal to P, despite the compelling evidence to the 
contrary outlined above. For all we know, it might even be the case that P = NP — 
in which case PPAD, lying “between” P and NP, would immediately be squeezed 
down to P as well. But it seems a reasonable working hypothesis that neither of these 
eventualities will actually hold, and that by proving a problem PPAD-complete we 
indeed establish it as an intractable problem. 


2.5 Succinct Representations of Games 


Computational problems have inputs, and the input to NASH is a description of the 
game for which we need to find an equilibrium. How long is such a description? 

Describing a game in strategic form entails listing all utilities for all players and 
strategy combinations. In the case of two players, with m and n strategies respectively, 
this amounts to describing 2mn numbers. This is what makes the two-player case of NASH
such a neat and interesting computational problem.

But we are interested in games because we think that they can model the Internet, 
markets, auctions — and these have far more than two players. Suppose that we have a 
game with n players, and think of n as being in the hundreds or thousands — a rather 
modest range for the contexts and applications outlined above. Suppose for simplicity 
that they all have the same number of strategies, call it s (in any nontrivial game s will
be at least two). Representing the game now requires $n s^n$ numbers!

This is a huge input. No user can be expected to supply it, and no algorithm to handle
it. Furthermore, the astronomical input trivializes complexity: If s is a small number
such as 2 or 5, a trivial efficient algorithm exists: try all combinations of supports.
But this algorithm is “efficient” only because the input is so huge: For fixed s, $(2^s)^n$ is
polynomial in the length of the input, $n s^n$...

Conclusion: In our study of the complexity of computational problems for games
such as NASH we must be especially interested in games with many players; however,
only succinctly representable multiplayer games can be of relevance and computational
interest.
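
A quick computation makes the point about input size tangible. The sketch below simply evaluates the two quantities discussed above, the number of utilities in strategic form and the number of support combinations, for n players with s strategies each; the function names are illustrative.

```python
def strategic_form_size(n, s):
    """Number of utilities listed in strategic form: n * s^n."""
    return n * s ** n

def support_combinations(n, s):
    """Upper bound on support combinations: each player picks a subset of s strategies."""
    return (2 ** s) ** n

n, s = 20, 2
size = strategic_form_size(n, s)        # 20 * 2^20
supports = support_combinations(n, s)   # 4^20 = (2^20)^2
```

For s = 2 the number of support combinations is exactly the square of 2^n, hence at most the square of the input length. In general $(2^s)^n = (s^n)^{s/\log_2 s}$, a fixed power of the input size when s is constant. Exhaustive search is therefore "polynomial" only because the input itself is exponential in n.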

And there are many such games in the literature; we start by describing one of the 
latest arrivals (Kearns et al., 2001) that happens to play a central role in our story. 


2.5.1 Graphical Games 


Suppose that many players are engaged in a complex game; yet, the utility of each
player depends on the actions of very few other players. That is, there is a directed
graph ({1, 2, ..., n}, E), with vertices the set of players, and (i, j) ∈ E only if the
utility of j depends on the strategy chosen by i (j's utility depends, of course, on the
strategy chosen by j). More formally, for any two strategy profiles s and s', if $s_j = s'_j$
and, for all (i, j) ∈ E, we have $s_i = s'_i$, then $u_j(s) = u_j(s')$. A graphical game, as these
are called, played on a graph with n nodes and indegree at most d, and s choices per
player, requires only $n s^{d+1}$ numbers for its description, a huge savings over $n s^n$ when
d is modest. (For more on graphical games, see Chapter 7.)

For a simple example, consider a directed cycle on 20 players, where the utilities are
captured by the game matrix A of Example 2.2. That is, if a player chooses a strategy
i ∈ {1, 2, 3} and his predecessor in the cycle chooses strategy j, then the utility
of the first player is $A_{ij}$ (the utility of the predecessor will depend on the strategy
played by his predecessor). Ordinarily, this game would require $20 \times 3^{20}$ numbers to
be described; its graph structure reduces this to just a few bytes.

Can you find a NASH equilibrium in this game? 
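
One candidate answer can be checked mechanically. The sketch below assumes the matrix of Example 2.2 has rows (0, 3, 0), (0, 0, 3), (2, 2, 2) and that player p's predecessor on the cycle is player p - 1; the function names are hypothetical. It tests the profile in which every player uses the symmetric equilibrium strategy (0, 1/3, 2/3) of the two-player game.

```python
from fractions import Fraction as F

A = [[0, 3, 0], [0, 0, 3], [2, 2, 2]]
N_PLAYERS = 20

def expected_payoffs(pred_mixed):
    """Expected utility of each pure strategy against the predecessor's mixed strategy."""
    return [sum(A[i][j] * pred_mixed[j] for j in range(3)) for i in range(3)]

def is_nash(profile):
    """profile[p] is the mixed strategy of player p on the directed cycle."""
    for p, mixed in enumerate(profile):
        payoffs = expected_payoffs(profile[p - 1])  # index -1 wraps around the cycle
        best = max(payoffs)
        if any(mixed[i] > 0 and payoffs[i] < best for i in range(3)):
            return False
    return True

x = [F(0), F(1, 3), F(2, 3)]
candidate = [x] * N_PLAYERS
all_play_third = [[F(0), F(0), F(1)]] * N_PLAYERS
```

Since each player only cares about his predecessor, copying a symmetric equilibrium of the two-player game around the cycle gives an equilibrium of the graphical game. By contrast, the pure profile in which everyone plays strategy 3 is not one, because strategy 2 earns 3 against it.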


2.5.2 Other Succinct Games 


There are many other computationally meaningful ways of representing some interest- 
ing games succinctly. Here are some of the most important ones. 


(i) Sparse games. If very few of the $n s^n$ utilities are nonzero, then the input can be
meaningfully small. Graphical games can be seen as a special case of sparse games,
in which the sparsity pattern is captured by a graph whose vertices are the players.

(ii) Symmetric games. In a symmetric game the players are all identical. So, in evaluating
the utility of a combination of strategies, what matters is how many of the n players
play each of the s strategies. Thus, to describe such a game we need only $s\binom{n+s-1}{s-1}$
numbers.

(iii) Anonymous games. This is a generalization of symmetric games, in which each player
is different, but cannot distinguish between the others, and so again his or her utility
depends on the partition of the other players into strategies. $sn\binom{n+s-2}{s-1}$ numbers
suffice here.

(iv) Extensive form games. These are given as explicit game trees (see the next chapter).
A strategy for a player is a combination of strategies, one for each vertex in the
game tree (information set, more accurately, see the next chapter for details) in which
the player has the initiative. The utility of a strategy combination is that of the leaf
reached if the strategies are followed.

(v) Congestion games. These games abstract the network congestion games studied in
Chapters 18 and 19. Suppose that there are n players, and a set of edges E. The set of
strategies for each player is a set of subsets of E, called paths. For each edge e ∈ E
we have a congestion function $c_e$ mapping {0, 1, ..., n} to the nonnegative integers.
If the n players choose strategies/paths $P = (P_1, \ldots, P_n)$, let the load of edge e, $\ell_e(P)$,
be the size of the set $\{i : e \in P_i\}$. Then the utility of the ith player is $\sum_{e \in P_i} c_e(\ell_e(P))$.

(vi) There is the even more succinct form of network congestion games, where E is the 
set of edges of an actual graph, and we are given two vertices for each player. The 
strategies available to a player are all simple paths between these two nodes. 

(vii) Local effect games. These are generalizations of the congestion games, see Leyton- 
Brown and Tennenholtz 2003. 
(viii) Facility location games. See Chapter 19. 

(ix) Multimatrix games. Suppose that we have n players with m strategies each, and for
each pair (i, j) of players an m × m utility matrix $A^{ij}$. The utility of player i for the
strategy combination $(s_1, \ldots, s_n)$ is $\sum_{j \neq i} A^{ij}_{s_i s_j}$. That is, each player receives the total
sum of his or her interactions with all other players.
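
To get a feel for how much these representations save, the following sketch tabulates description sizes for a few of them. It assumes the counts $n s^n$ for strategic form, $n s^{d+1}$ for graphical games of indegree d, $s\binom{n+s-1}{s-1}$ for symmetric games, $sn\binom{n+s-2}{s-1}$ for anonymous games, and one s × s matrix per ordered pair of distinct players for multimatrix games. The function name `description_sizes` is an illustrative choice.

```python
from math import comb

def description_sizes(n, s, d):
    """Numbers needed to describe an n-player, s-strategy game in several formats."""
    return {
        "strategic form": n * s ** n,
        "graphical (indegree d)": n * s ** (d + 1),
        "symmetric": s * comb(n + s - 1, s - 1),
        "anonymous": s * n * comb(n + s - 2, s - 1),
        "multimatrix": n * (n - 1) * s * s,
    }

sizes = description_sizes(n=20, s=3, d=1)
for name, count in sizes.items():
    print(f"{name:24s} {count}")
```

With 20 players and 3 strategies each, the strategic form needs about 7 × 10^10 numbers, while each succinct format needs at most a few thousand.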


2.6 The Reduction 


In this section we give a brief sketch of the reduction, recently discovered in Daskalakis 
et al. (2006) and Goldberg and Papadimitriou (2006) and extended to two-player games 
in Chen and Deng (2005b), which establishes that NASH is PPAD-complete. 


2.6.1 A PPAD-Complete Problem 


The departure point of the reduction is BROUWER, a stylized discrete version of the
Brouwer fixpoint problem. It is presented in terms of a function φ from the
three-dimensional unit cube to itself. Imagine that the unit cube is subdivided into $2^{3n}$ equal
cubelets, each of side $\epsilon = 2^{-n}$, and that the function need only be described at all
cubelet centers. At a cubelet center x, φ(x) can take four values: $x + \delta_i$, i = 0, ..., 3,
where the $\delta_i$'s are the following tiny displacements mapping the center of the cubelet to
the center of a nearby cubelet: $\delta_1 = (\epsilon, 0, 0)$, $\delta_2 = (0, \epsilon, 0)$, $\delta_3 = (0, 0, \epsilon)$, and finally
$\delta_0 = (-\epsilon, -\epsilon, -\epsilon)$. If x is the center of a boundary cubelet, then we must make sure
that φ(x) does not fall outside the cube, but this is easy to check. We are seeking
a “fixpoint,” which is defined here to be any internal cubelet corner point such that,
among its eight adjacent cubelets, all four possible displacements $\delta_i$, i = 0, ..., 3, are
present.

But how is the function φ represented? We assume that φ is given in terms of a
Boolean circuit, a directed acyclic graph of AND, OR, and NOT gates, with 3n bits as
inputs (enough to describe the cubelet in question) and two bits as outputs (enough to
specify which one of the four displacements is to be applied). This is a computationally
meaningful way of representing functions that is quite common in the complexity theory
literature; any function φ of the sort described above (including the boundary checks)
can be captured by such a circuit. And this completes the description of BROUWER, our
starting PPAD-complete problem: Given a Boolean circuit describing φ, find a fixpoint
of φ. We omit the challenging proof that it is indeed PPAD-complete (see Daskalakis
et al., 2006).
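
For tiny instances the discrete fixpoint condition can be checked by exhaustive search, which at least pins down the definition. In the sketch below a Python callable `phi(i, j, k)` stands in for the Boolean circuit: it takes the integer coordinates of a cubelet (each in 0, ..., 2^n - 1) and returns the index 0, 1, 2, or 3 of the displacement applied there. The names `find_fixpoint` and `example_phi`, and the particular example function, are assumptions made for illustration only.

```python
from itertools import product

def find_fixpoint(phi, n):
    """Return an internal corner whose eight adjacent cubelets show all four displacements."""
    N = 2 ** n
    for corner in product(range(1, N), repeat=3):
        # The corner at integer position p touches cubelets p-1 and p in each coordinate.
        seen = {phi(*(c - off for c, off in zip(corner, offs)))
                for offs in product((0, 1), repeat=3)}
        if seen == {0, 1, 2, 3}:
            return corner
    return None

def example_phi(n):
    """A displacement rule that respects the boundary: push toward the far corner."""
    N = 2 ** n
    def phi(i, j, k):
        if i < N - 1:
            return 1      # move in +x; never used on the face x = max
        if j < N - 1:
            return 2      # move in +y
        if k < N - 1:
            return 3      # move in +z
        return 0          # only the far-corner cubelet moves diagonally back
    return phi

corner = find_fixpoint(example_phi(2), 2)
```

This search inspects all $(2^n - 1)^3$ internal corners, which is exponential in the 3n bits that name a cubelet. It is the brute-force baseline that BROUWER asks us to beat.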


2.6.2 The Plan 


But how should we go about reducing this problem to NASH? We shall start by reduc- 
ing BROUWER to an intermediate graphical game with many players. All these players 
have just two strategies, 0 and 1; therefore, we can think of any mixed strategy of a 
player as a number in [0, 1] (the probability he or she assigns to strategy 1). Three 
of these players will be thought of as choosing three numbers that are the coordi- 
nates of a point in the cube. Others will respond by analyzing these coordinates to 
identify the cubelet wherein this point lies, and by computing (by a simulation of the 
circuit) the displacements $\delta_i$ at the cubelet and adjacent cubelets. The resulting choices
by the players will incentivize the three original players to change their mixed strategy,
unless the point is a fixpoint of φ, in which case the three players will not change
strategies, and the graphical game will be at a Nash equilibrium!


2.6.3 The Gadgets 


To carry out this plan, we need certain devices — commonly called “gadgets” in the 
reduction business — for performing basic arithmetic and logical operations. That is, we 
need to define certain small graphical games with players that are considered as inputs 
and another player as output, such that in any NASH equilibrium the mixed strategy of 
the output player (thought of as a real number between 0 and 1) stands in a particular 
arithmetical or logical relation with the inputs (again, thought of as numbers). 

Consider, for example, the multiplication game. It has four players, two input players
a and b, an output player c, and a middle player d. The underlying directed graph has
edges (a, d), (b, d), (c, d), (d, c); i.e., one of these four players affects the utility of
another if and only if there is an edge in this list from the former to the latter. The players
have two strategies each, called 0 and 1, so that any mixed strategy of a player
is in fact a real number in [0, 1] (the probability with which the player plays strategy 1).
The utilities are so constructed that in any Nash equilibrium of this game, the output is
always the product of the two inputs, all seen as numbers, of course: c = a · b (here
we use a to represent not just player a, but also its value, i.e., the probability with
which he plays strategy 1). To specify the game, we need to describe the utilities of
the output and middle player (the utilities of the inputs are irrelevant since they have
no incoming edges; this is crucial, because it allows the inputs to be “reused” in many
gadgets, without one use influencing the others). If the middle player d plays 1 (recall
that all nodes have two strategies, 1 and 0), then its utility is 1 if both inputs play 1,
and it is 0 otherwise. Thus, if the two input players play 1 with probabilities a and
b (recall that these are the “values” of the two inputs), and the middle player plays 1,
then his expected utility is exactly a · b. If on the other hand the middle player plays 0, then its
utility is 1 if the output player plays 1, and it is 0 otherwise. Finally, the output player
gets utility 1 if the middle player plays 1, and −1 if he plays 0.

Thus, the output player is motivated to play 1 with probability c, which is as high as
possible, in order to maximize the utility from the middle player's playing 1, but not
so high that the middle player is tempted to play 0, as he would whenever c > a · b.
Thus, at equilibrium, c must be exactly a · b, and the multiplication gadget works!
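
The equilibrium argument can be checked numerically on a grid. The sketch below fixes one reading of the utilities, which is an assumption: the output player earns +1 or −1 (according to whether the middle player plays 1 or 0) only when the output player himself plays 1, and earns 0 when he plays 0. The function names are hypothetical.

```python
from fractions import Fraction as F

def best_response_ok(p, u1, u0):
    """A mixed strategy p (probability of playing 1) is a best response
    iff it puts weight only on strategies of maximum expected utility."""
    if u1 > u0:
        return p == 1
    if u1 < u0:
        return p == 0
    return True

def is_equilibrium(a, b, c, d):
    """Check the middle player d and the output player c, given inputs a and b."""
    middle_ok = best_response_ok(d, a * b, c)       # playing 1 pays a*b, playing 0 pays c
    output_ok = best_response_ok(c, 2 * d - 1, 0)   # playing 1 pays d - (1 - d), playing 0 pays 0
    return middle_ok and output_ok

def equilibria_on_grid(a, b, steps=12):
    """All (c, d) on a grid of multiples of 1/steps that form an equilibrium."""
    grid = [F(k, steps) for k in range(steps + 1)]
    return [(c, d) for c in grid for d in grid if is_equilibrium(a, b, c, d)]

found = equilibria_on_grid(F(1, 2), F(1, 3))
```

With a = 1/2 and b = 1/3 the only grid point that survives is c = 1/6 with d = 1/2: the output equals the product, and the middle player mixes evenly so that the output player is indifferent.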

In a similar manner we can construct gadgets that add and subtract their inputs
(always within the range [0, 1], of course), or perform certain logical operations. For
example, it is a trivial exercise to design a gadget with two nodes, an input x and
an output y, such that y = 1 if x > 1/2 and y = 0 if x < 1/2 (notice that, importantly,
the output of this comparator is undetermined if x = 1/2). It is also easy to design
gadgets that perform AND, OR, and NOT operations on their inputs (the inputs here
are assumed to be Boolean, that is to say, pure strategies).


2.6.4 The Graphical Game 


Using these devices, we can put together a graphical game whose Nash equilibria
reflect accurately the Brouwer fixpoints of the given function φ.

The graphical game is huge, but has a simple structure: There are three players, called 
the leaders, whose mixed strategies identify a point (x, y, z) in the unit cube. These 
leaders are inputs to a series of comparators and subtractors which extract one by one 
the n most significant bits of the binary representation of x, y, and z, thus identifying 
the cubelet within which the point (x, y, z) lies. A system of logical gadgets could
then compute the outputs of the given circuit that describes φ, when the inputs are the
3n extracted bits, repeat for the neighboring cubelets, and decide whether we are at a
fixpoint.
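
The bit-extraction step can be mimicked in ordinary arithmetic. The following sketch is an idealized model, not the gadget construction itself: `comparator` returns `None` where the real gadget would be undetermined, and doubling is modeled as adding a value to itself, which is an assumption about how the addition gadget would be used.

```python
from fractions import Fraction as F

def comparator(x):
    """1 if x > 1/2, 0 if x < 1/2, and None (undetermined) at exactly 1/2."""
    half = F(1, 2)
    if x == half:
        return None
    return 1 if x > half else 0

def extract_bits(x, n):
    """Extract the n most significant bits of x in [0, 1], or None if a comparator is undetermined."""
    bits = []
    for _ in range(n):
        b = comparator(x)
        if b is None:
            return None
        bits.append(b)
        x = x - F(b, 2)   # subtract the extracted bit's contribution
        x = x + x         # shift left by one binary place
    return bits

inside = extract_bits(F(11, 16), 3)     # 0.1011 in binary
on_boundary = extract_bits(F(5, 8), 3)  # a multiple of 2^-3
```

A point strictly inside a cubelet yields its three leading bits, whereas a point on a cubelet boundary hits a comparator input of exactly 1/2 at some stage, so the extraction there is not well defined.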

But there is a catch: As we pointed out above, our comparators are “brittle” in that 
they are indeterminate when their input is exactly half. This is of necessity: It can 
be shown (see Daskalakis et al., 2006) that nonbrittle comparators (ones that behave 
deterministically at half) cannot exist! (It turns out that, with such comparators, we
could construct a graphical game with no NASH equilibrium ...) This has the effect
that the computation described above is imprecise (and, in fact, in an unpredictable 
manner) when the point (x, y, z) lies exactly on the boundary of a cubelet, and this can 
create spurious equilibria. We must somehow “smoothen” this discontinuity. 

This is accomplished by a more complicated construction, in which the calculation 
of φ is carried out not for the single point (x, y, z) but for a large and very fine grid of
points around it, with all results averaged. 

Once the average displacement (Δx, Δy, Δz) near (x, y, z) has been calculated, its
components are added to the three leaders, completing the construction of the graphical 
game. This way the loop is closed, and the leaders (who had heretofore no incoming 
edges) are finally affected — very indirectly, of course — by their own choices. We 
must now prove that the NASH equilibria of this game correspond precisely to those 
points in the unit cube for which the average displacement is the zero vector. And 
from this, establish that the average displacement is zero if and only if we are near a 
fixpoint. 


2.6.5 Simulating the Graphical Game by Few Players 


We have already established an interesting result: Finding a NASH equilibrium in a
graphical game is PPAD-complete. It is even more interesting because the underlying
directed graph of the game, despite its size and complexity, has a rather simple
structure: It is bipartite, and all vertices have indegree three or less. It is bipartite
because all gadgets are bipartite (the inputs and the outputs are on one side, the middle
nodes on the other; the logical gadgets can be redesigned to have a middle node as
well); and the way the gadgets are put together maintains the bipartite property. Finally,
the middle nodes of the gadgets are the ones of maximum indegree, namely three.

The challenge now is to simulate this graphical game by one with finitely many 
players. Already in Goldberg and Papadimitriou (2006) and Daskalakis et al. (2006), a 
simulation by four players was shown, establishing that NASH is PPAD-complete even
in the four-player case. The idea in the simulation is this: Each of the four players 
“represents” many nodes of the graphical game. How players are represented is best 
understood in terms of a particular undirected graph associated with the graphical 
game, called the conflict graph. This graph is defined on the vertices of the graphical 
game, and has an edge between two nodes u and v if in the graphical game either (a) 
there is an edge between u and v, in either direction, or (b) there are edges from both u 
and v to the same node w. This is the conflict graph of the game; it should be intuitively 
clear that eventualities (a) and (b) make it difficult for the same player to represent both 
u and v, and so coloring the conflict graph and assigning its color classes to different 
players makes sense. The crucial observation is that the conflict graph of the graphical 
game constructed in the reduction is four-colorable. 
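
The definition of the conflict graph can be made concrete with a short sketch. The code below builds the conflict graph from the directed edges of a graphical game, following cases (a) and (b), and then colors it greedily. The names `conflict_graph` and `greedy_coloring`, and the four-node toy gadget used to exercise them, are illustrative assumptions and not part of the reduction itself; greedy coloring is also not guaranteed to use the fewest colors in general.

```python
from itertools import combinations


def conflict_graph(nodes, edges):
    """Return the undirected conflict graph as a set of 2-element frozensets.

    `edges` holds directed pairs (u, v), read as "v's utility depends on u".
    """
    conflicts = set()
    # Case (a): u and v are joined by an edge, in either direction.
    for u, v in edges:
        if u != v:
            conflicts.add(frozenset((u, v)))
    # Case (b): u and v both have an edge into a common node w.
    preds = {w: set() for w in nodes}
    for u, w in edges:
        preds[w].add(u)
    for w in nodes:
        for u, v in combinations(sorted(preds[w]), 2):
            conflicts.add(frozenset((u, v)))
    return conflicts


def greedy_coloring(nodes, conflicts):
    """Give each node the smallest color not used by an already colored neighbor."""
    colors = {}
    for u in nodes:
        taken = {colors[v] for v in colors if frozenset((u, v)) in conflicts}
        colors[u] = next(c for c in range(len(nodes)) if c not in taken)
    return colors


# Hypothetical toy gadget: inputs a, b and output c feed a middle node m,
# and m feeds back into c (so m has indegree three).
toy_nodes = ["a", "b", "m", "c"]
toy_edges = {("a", "m"), ("b", "m"), ("c", "m"), ("m", "c")}
toy_conflicts = conflict_graph(toy_nodes, toy_edges)
toy_colors = greedy_coloring(toy_nodes, toy_conflicts)
```

In this toy gadget every pair of nodes is in conflict, so each node ends up with its own color, that is, its own "lawyer".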

So, we can assign to each of four players (think of them as “lawyers”) all nodes
(call them “clients”) in a color class. A lawyer’s strategy set is the union of the strategy
sets of his clients, and so the clients can be represented fairly if the lawyer plays the
average of their mixed strategies. Since the clients come from a color class of the 
conflict graph, the lawyer can represent them all with no conflict of interest (he or she 
should not represent two players that play against one another, or two players who 
both play against a third one). But there is a problem: A lawyer may neglect some 
clients with small payoffs and favor (in terms of weights in his mixed strategy) the 
more lucrative ones. This is taken care of by having the four lawyers play, on the side, a 
generalization of the “rock-paper-scissors game,” at very high stakes. Since this game 
is known to force the players to distribute their probabilities evenly, all clients will 
now be represented fairly in the lawyer’s mixed strategy; the four-player simulation is 
complete. 

These results, up to the four-player simulation, first appeared in the beginning of
October 2005 (Goldberg and Papadimitriou, 2006; Daskalakis et al., 2006). It was
conjectured in Daskalakis et al. (2006) that the 3-player case of NASH is also
PPAD-complete, whereas the 2-player case is in P. Indeed, a few weeks later, two independent
and very different simulations of the graphical game by three players appeared (Chen
and Deng, 2005b; Daskalakis and Papadimitriou, 2005) thus proving the first part
of this conjecture. The proof in Daskalakis and Papadimitriou (2005) was local, and
worked by modifying the gadgets so that the conflict graph became three-colorable;
this approach had therefore reached its limit, because for the graphical game to work
the conflict graph must contain triangles. It was again conjectured in Daskalakis and
Papadimitriou (2005) that the two-player case can be solved in polynomial time. In
contrast, the proof in Chen and Deng (2005b) was more ad hoc and nonlocal, and was
therefore in a sense more open-ended and promising.

A month later, a surprisingly simple two-player simulation was discovered (Chen
and Deng, 2005a), thus establishing that even the two-player case of NASH is
PPAD-complete! The intuitive idea behind this new construction is that many of the “conflicts
of interest” captured in the conflict graph (in particular, the (b) case of its definition)
happen to be unproblematic in this particular game: The two input nodes of a gadget cannot
effectively “conspire” to improve their lot, and thus they could, in principle, be
represented by the same (carefully programmed) lawyer. Thus, only two players are needed,
corresponding to the two sides of the bipartite graphical game. The construction is now
in fact a little more direct: there is no graph game, and the two players are constructed
ab initio, with the gadgets, as well as the side game of rock–paper–scissors, built in.


2.6.6 Approximate Equilibria 


Incidentally, this side game of rock–paper–scissors is the source of another difficulty
that permeates all these proofs, and which we have not yet discussed: It only guarantees
that the lawyers approximately balance the interests of their clients; as a result, the
whole reduction, and the argument at each stage of the construction, must be carried
out in terms of ε-approximate NASH equilibria. An ε-approximate NASH equilibrium is a
mixed strategy profile such that no other strategy can improve the payoff by more than
an additive ε. (Notice that an ε-approximate NASH equilibrium may or may not be near
a true NASH equilibrium.) It is easy to see, in retrospect, that this use of approximation
is inherently needed: Two-player games always have rational NASH equilibria, whereas
games with more players may have only irrational ones. Any simulation of the latter
by the former must involve some kind of approximation.
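
For a two-player game the definition of an ε-approximate NASH equilibrium is easy to check mechanically. The following sketch computes the largest additive gain either player could obtain by deviating; because expected payoff is linear in a player's own mixed strategy, it is enough to compare against pure deviations. The function names `nash_gap` and `is_eps_nash`, and the matching-pennies payoffs used as test data, are illustrative assumptions.

```python
from fractions import Fraction as F


def expected(P, x, y):
    """Expected payoff x^T P y for payoff matrix P and mixed strategies x, y."""
    return sum(x[i] * P[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def nash_gap(A, B, x, y):
    """Largest additive improvement available to either player by deviating."""
    m, n = len(x), len(y)
    best_row = max(sum(A[i][j] * y[j] for j in range(n)) for i in range(m))
    best_col = max(sum(x[i] * B[i][j] for i in range(m)) for j in range(n))
    return max(best_row - expected(A, x, y), best_col - expected(B, x, y))


def is_eps_nash(A, B, x, y, eps):
    """True if (x, y) is an eps-approximate equilibrium of the game (A, B)."""
    return nash_gap(A, B, x, y) <= eps


# Matching pennies (illustrative data): the uniform profile is an exact
# equilibrium, while a slightly biased row strategy leaves a gap of 1/5.
pennies_A = [[1, -1], [-1, 1]]
pennies_B = [[-1, 1], [1, -1]]
uniform = [F(1, 2), F(1, 2)]
biased = [F(3, 5), F(2, 5)]
```

With these values, `nash_gap(pennies_A, pennies_B, biased, uniform)` evaluates to 1/5, so the biased profile is ε-approximate exactly for ε ≥ 1/5.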

Now that we know that computing NASH equilibria is an intractable problem,
computing approximate equilibria emerges as a very attractive compromise. But can it
be done in polynomial time? The reduction described so far shows that it is
PPAD-complete to compute ε-approximate NASH equilibria when ε is exponentially small
(smaller than the side of the cubelet in the initial BROUWER problem, or 2^{−cn} for some
c > 0, where n is the number of strategies). Starting from an n-dimensional version
of BROUWER, the result can be strengthened up to an ε that is an inverse polynomial,
n^{−c} (Chen et al., 2006).

There are some positive algorithmic results known for approximate NASH equilibria:
1/2-approximate NASH equilibria are very easy to compute in two-player games
(Daskalakis et al., in press) and an ε-approximate NASH equilibrium can be found in
less than exponential time (more specifically, in time n^{(log n)/ε²}) in arbitrary games (see
Lipton et al., 2003). Discovering polynomial algorithms for computing ε-approximate
NASH equilibria for ε between these values (possibly for arbitrarily small constant
ε > 0) remains an important open problem.
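
A running time of the form n^{(log n)/ε²} is what one would expect from an exhaustive search over "small-support" strategies. The sketch below illustrates that idea under an assumption not spelled out in the text above: that some ε-approximate equilibrium exists in which each player mixes uniformly over a multiset of k pure strategies, with k on the order of (log n)/ε² for payoffs normalized to [0, 1]. The function names and the parameter `k` are hypothetical, and the code is a brute-force illustration rather than the cited authors' algorithm.

```python
from fractions import Fraction as F
from itertools import combinations_with_replacement


def uniform_on(multiset, size):
    """Mixed strategy that plays each listed index with weight 1/len(multiset)."""
    x = [F(0)] * size
    for i in multiset:
        x[i] += F(1, len(multiset))
    return x


def gap(A, B, x, y):
    """Largest additive gain either player can obtain by deviating from (x, y)."""
    m, n = len(x), len(y)
    rows = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    cols = [sum(x[i] * B[i][j] for i in range(m)) for j in range(n)]
    u1 = sum(x[i] * rows[i] for i in range(m))
    u2 = sum(cols[j] * y[j] for j in range(n))
    return max(max(rows) - u1, max(cols) - u2)


def sampled_eps_nash(A, B, k, eps):
    """Search all pairs of size-k multisets; return the first eps-approximate pair."""
    m, n = len(A), len(A[0])
    for I in combinations_with_replacement(range(m), k):
        x = uniform_on(I, m)
        for J in combinations_with_replacement(range(n), k):
            y = uniform_on(J, n)
            if gap(A, B, x, y) <= eps:
                return x, y
    return None  # no profile of this restricted shape is eps-approximate
```

The number of multisets of size k over n strategies is at most n^k, which is where a quasi-polynomial bound would come from once k is logarithmic in n.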


2.7 Correlated Equilibria 


Consider the symmetric game (often called chicken) with payoffs

$$\begin{array}{c|cc} & \text{stop} & \text{go} \\ \hline \text{stop} & 4, 4 & 1, 5 \\ \text{go} & 5, 1 & 0, 0 \end{array}$$

The payoffs are supposed to capture the situation in which two very macho drivers
speed toward an intersection. Each has two options: Stop or go. There are two pure
equilibria (me and you) and the symmetric mixed equilibrium (1/2, 1/2). These three
NASH equilibria create the following three probability distributions on the pure strategy
profiles:

$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1/4 & 1/4 \\ 1/4 & 1/4 \end{pmatrix}.$$
Consider however the following distribution: $\begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix}$. It is not a NASH equilibrium;
in fact, it is easy to see that there are no two fixed strategies for the two players that
generate this distribution (in algebraic terms, the matrix is not of rank one). However, it
is a rational outcome of the game, in the following more sophisticated sense: Suppose
that a trusted third party draws from this distribution, and recommends to each player
to play according to the outcome. (Coming back to the drivers story, this solution,
randomizing between (stop, go) and (go, stop), is tantamount to a traffic signal.) If
the lower left box is chosen, e.g., the recommendation is that Player 1 go and Player
2 stop (i.e., green light for Player 1). What is remarkable about this distribution of
recommendations is that it is self-enforcing: If either player assumes that the other will
follow the recommendation, his best bet is to actually follow the recommendation!

This motivates the following definition (Aumann, 1974): A correlated equilibrium is
a probability distribution {p_s} on the space of strategy profiles that obeys the following
conditions: For each player i, and every two different strategies j, j′ of i, conditioned
on the event that a strategy profile with j as i’s strategy was drawn from the distribution,
the expected utility of playing j is no smaller than that of playing j′:

$$\sum_{s \in S_{-i}} (u_{sj} - u_{sj'})\, p_{sj} \ge 0. \qquad \text{(CE)}$$

(Naturally, we also require that p_s ≥ 0 and ∑_s p_s = 1.) Here by S_{−i} we denote the
strategy profiles of all players except for i; if s ∈ S_{−i}, sj denotes the strategy profile
in which player i plays j and the others play s, and u_{sj} is player i’s utility at that
profile. Notice that the inequalities express
exactly the requirement that, if a strategy profile is drawn from the distribution {p_s}
and each player is told, privately, his or her own component of the outcome, and if
furthermore all players assume that the others will follow the recommendation, then
the recommendation is self-enforcing.

Notice also the following: If p^i, i = 1, ..., n, is a set of mixed strategies of the
players, and we consider the distribution p_s induced by it (p_s = ∏_i p^i_{s_i}), then the
inequalities (CE) state that these mixed strategies constitute a mixed NASH equilibrium!
Indeed, for each i, j, j′, inequality (CE) states in this case that, if j is in i’s support, then
it is a best response. (If strategy j is not in the support, then the inequality becomes a
tautology, 0 ≥ 0; if it is in the support, then we can divide the whole inequality by its
probability, and the resulting inequality says that j is a best response.) We conclude
that any NASH equilibrium is a correlated equilibrium. In other words, the correlated
equilibrium is a generalization of the NASH equilibrium, allowing the probabilities on
the space of strategy profiles to be correlated arbitrarily. Conversely, NASH equilibrium
is the special case of correlated equilibrium in which the p_s’s are restricted to come from
a product (uncorrelated) distribution.

For example, in the drivers game, the (CE) inequalities are as follows:

$$\begin{aligned}
(4-5)\,p_{11} + (1-0)\,p_{12} &\ge 0 \\
(5-4)\,p_{21} + (0-1)\,p_{22} &\ge 0 \\
(4-5)\,p_{11} + (1-0)\,p_{21} &\ge 0 \\
(5-4)\,p_{12} + (0-1)\,p_{22} &\ge 0
\end{aligned}$$
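
The four inequalities can be checked directly for any candidate distribution. The sketch below does so for the drivers game, assuming the payoffs (4, 4), (1, 5), (5, 1), (0, 0) for (stop, stop), (stop, go), (go, stop), (go, go), which are the values the coefficients of the inequalities imply. The helper names `ce_slacks` and `is_correlated_equilibrium` are illustrative.

```python
from fractions import Fraction as F

STOP, GO = 0, 1
# U[player][row][col]: payoffs assumed from the coefficients of the (CE) inequalities.
U = [
    [[4, 1], [5, 0]],  # player 1 chooses the row
    [[4, 5], [1, 0]],  # player 2 chooses the column
]


def ce_slacks(p):
    """Left-hand sides of the four (CE) inequalities for a distribution p[row][col]."""
    slacks = []
    for j in (STOP, GO):  # player 1 is told row j and considers the other row
        other = 1 - j
        slacks.append(sum((U[0][j][s] - U[0][other][s]) * p[j][s]
                          for s in (STOP, GO)))
    for j in (STOP, GO):  # player 2 is told column j and considers the other column
        other = 1 - j
        slacks.append(sum((U[1][s][j] - U[1][s][other]) * p[s][j]
                          for s in (STOP, GO)))
    return slacks


def is_correlated_equilibrium(p):
    """True if p is a probability distribution satisfying all (CE) inequalities."""
    total = sum(sum(row) for row in p)
    nonneg = all(v >= 0 for row in p for v in row)
    return total == 1 and nonneg and all(v >= 0 for v in ce_slacks(p))


traffic_light = [[F(0), F(1, 2)], [F(1, 2), F(0)]]
mixed_nash = [[F(1, 4), F(1, 4)], [F(1, 4), F(1, 4)]]
always_stop = [[F(1), F(0)], [F(0), F(0)]]
```

The traffic-light distribution satisfies every inequality with slack 1/2, the mixed NASH equilibrium satisfies them with equality, and the distribution that always recommends (stop, stop) violates the first one.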


A crucial observation now is that the (CE) inequalities are linear in the unknown
variables {p_s}, and thus the system (CE) can always be solved efficiently by linear
programming. In fact, we know that these inequalities always have at least one
solution: The NASH equilibrium that is guaranteed to exist by NASH’s theorem.

To restate the situation in terms of our concerns in this chapter, the correlated
equilibrium is a computationally benign generalization of the intractable NASH
equilibrium. We can find in polynomial time a correlated equilibrium for any game. In
fact, we can find the correlated equilibrium that optimizes any linear function of the
p_s’s, such as the expected sum of utilities. For example, in the drivers game, we can
optimize the sum of the players’ expected utilities by maximizing the linear objective
8p_11 + 6p_12 + 6p_21 over the polytope defined by the inequalities above. The optimum
correlated equilibrium is this:

$$\begin{pmatrix} 1/3 & 1/3 \\ 1/3 & 0 \end{pmatrix},$$

a traffic light that is red for both one third of the time.
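
The optimum can be sanity-checked without an LP solver. The sketch below is a brute-force stand-in for linear programming, not a substitute for it: it enumerates all distributions whose probabilities are multiples of 1/n, keeps those that satisfy the (CE) inequalities of the drivers game (written here in simplified form), and reports the best value of the objective 8p_11 + 6p_12 + 6p_21. The function names and the grid resolution are illustrative assumptions.

```python
from fractions import Fraction as F
from itertools import product


def is_ce(p11, p12, p21, p22):
    """The four (CE) inequalities of the drivers game, with coefficients simplified."""
    return (-p11 + p12 >= 0 and p21 - p22 >= 0 and
            -p11 + p21 >= 0 and p12 - p22 >= 0)


def best_on_grid(n):
    """Maximize 8*p11 + 6*p12 + 6*p21 over correlated equilibria on a 1/n grid."""
    best_val, best_p = None, None
    for a, b, c in product(range(n + 1), repeat=3):
        d = n - a - b - c
        if d < 0:
            continue  # probabilities must sum to one
        p = tuple(F(k, n) for k in (a, b, c, d))
        if not is_ce(*p):
            continue
        val = 8 * p[0] + 6 * p[1] + 6 * p[2]
        if best_val is None or val > best_val:
            best_val, best_p = val, p
    return best_val, best_p


value, dist = best_on_grid(12)
```

On a grid with n = 12 the search returns the value 20/3 at (p_11, p_12, p_21, p_22) = (1/3, 1/3, 1/3, 0), compared with 6 for the plain traffic light.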


2.7.1 Correlated Equilibria vs NASH Equilibria: The Whole Picture


The polytope defined by the (CE) inequalities in the case of the drivers game is shown
in Figure 2.4 (the fourth dimension, p_22 = 1 − p_11 − p_12 − p_21, is suppressed in the
geometric depiction). Every point in this polytope is a correlated equilibrium. There
are two pure NASH equilibria (N1 and N2) and one symmetric mixed one (N3). The
“traffic light” correlated equilibrium C1 = $\begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix}$ and the optimum one
C2 = $\begin{pmatrix} 1/3 & 1/3 \\ 1/3 & 0 \end{pmatrix}$ are also shown. Notice that the three NASH equilibria are
vertices of the polytope. This is no coincidence.


Theorem 2.5 In any nondegenerate two-player game, the NASH equilibria are
vertices of the (CE) polytope.


Naturally, not all vertices of the (CE) polytope will be NASH equilibria, but at
least one will be. In other words, in two-player games every NASH equilibrium is the
optimum correlated equilibrium for some linear function; unfortunately, guessing this
function is apparently not easy.

To recapitulate, NASH equilibria are correlated equilibria satisfying the further
constraint that they are the product distribution of some pair of mixed strategies. It is
this single additional constraint that makes the problem of finding a NASH equilibrium
so much harder. It is apparently a very nonconvex constraint (think of it as a curved
surface in Figure 2.4, “touching” the (CE) polytope at three of its vertices). In contrast,
for three or more players there are games in which the NASH equilibria are not vertices
of the (CE) polytope; e.g., it is easy to see that any game with integer utilities that has
only irrational NASH equilibria must be of this sort.


Figure 2.4. The three NASH equilibria (N1, N2, N3) of the drivers’ game are vertices of the
polytope of the correlated equilibria. Two other correlated equilibria are shown (C1, C2).


2.7.2 Correlated Equilibria in Succinct Games 


But as we observed in Section 2.5, polynomial-time algorithms whose input is a 
game, such as the linear programming algorithm for finding correlated equilibria, 
make a mockery of complexity theory when the number of players is reasonably high. 
This brings us to the following important question: Can we find correlated equilibria 
efficiently when the game is represented succinctly? 

There are some very interesting (and very natural) “learning” algorithms for
approximating correlated equilibria, reviewed in Chapter 4 of this book. These algorithms
work by simulating repeated play of the game, in which the various players change
their strategies according to how much they “regret” previous decisions. Certain
sophisticated ways of doing this are guaranteed to reach a point that is quite close to
the (CE) polytope. To arrive at a distance ε from the (CE) polytope, 1/ε^c iterations are
required, where c is some small constant depending on the particular method. But the
question remains, can we find a point of the (CE) polytope in polynomial time?

Recently, there have been some interesting results on this question; to state them we
need to introduce some definitions. We say that a succinctly representable game is of
polynomial type if the number of players, as well as the number of strategies of each
player, in a game represented by a string of length n is always bounded by a polynomial
in n. For such a game, the expected utility problem is this: Calculate the expected utility
of each player, if for each player i the given mixed strategy p^i is played. It turns out
that solving this problem is enough for the correlated equilibrium problem to be
solved:


Theorem 2.6 (Papadimitriou, 2005) In any succinctly representable game of
polynomial type for which the expected utility problem can be solved in polynomial
time, the problem of finding a correlated equilibrium can be solved in polynomial
time as well. Consequently, there is a polynomial-time algorithm (polynomial in
the length of the description of the game) for finding a correlated equilibrium
in sparse, symmetric, anonymous, graphical, congestion, local effect, facility
location, and multimatrix games (among many others; recall the definitions in
Section 2.5).


But how about the slightly more demanding problem of finding, not just any corre- 
lated equilibrium, but the one that optimizes a given linear objective of the probabilities? 
A much less sweeping result is available here. 


Theorem 2.7 (Papadimitriou and Roughgarden, 2005) The problem of opti- 
mizing a linear function over correlated equilibria can be solved in polynomial 
time for symmetric games, anonymous games, and graphical games for which the 
underlying graph is of bounded treewidth. 


In contrast, it is NP-hard to find the optimum correlated equilibrium in general
graphical games and congestion games, among others (Papadimitriou and
Roughgarden, 2005).


2.8 Concluding Remarks 


The computational complexity of equilibrium concepts deserves a central place in 
game theoretic discourse. The proof, outlined in this chapter, that finding a mixed 
NASH equilibrium is PPAD-complete raises some interesting questions regarding the 
usefulness of the NASH equilibrium, and helps focus our interest in alternative notions 
(most interesting among them the approximate NASH equilibrium discussed in the end 
of Section 2.6). 

But there are many counterarguments to the importance of such a negative
complexity result. It only shows that it is hard to find a NASH equilibrium in some very
far-fetched, artificial games that happen to encode Brouwer functions. Of what
relevance can such a result be to economic practice?

The same can be said (and has been said, in the early days) about the
NP-completeness of the traveling salesman problem, for example. And the answer remains
the same: The PPAD-completeness of NASH suggests that any approach to finding
NASH equilibria that aspires to be efficient, as well as any proposal for using the concept
in an applied setting, should explicitly take advantage of computationally beneficial
special properties of the games in hand, by proving positive algorithmic results for
interesting classes of games. On the other hand (as has often been the case with
NP-completeness, and as it has started to happen here as well; Abbott et al., 2005; Codenotti
et al., 2006), PPAD-completeness proofs will be eventually refined to cover simpler
and more realistic-looking classes of games. And then researchers will strive to identify
even simpler classes.

An intractability result such as the one outlined in this chapter should be most
usefully seen as the opening move in an interesting game.


CHAPTER 3


Equilibrium Computation for 
Two-Player Games in Strategic 
and Extensive Form 


Bernhard von Stengel 


Abstract 


We explain algorithms for computing Nash equilibria of two-player games given in strategic form or 
extensive form. The strategic form is a table that lists the players’ strategies and resulting payoffs. 
The “best response” condition states that in equilibrium, all pure strategies in the support of a 
mixed strategy must get maximal, and hence equal, payoff. The resulting equations and inequalities 
define polytopes, whose “completely labeled” vertex pairs are the Nash equilibria of the game. The 
Lemke–Howson algorithm follows a path of edges of the polytope pair that leads to one equilibrium.
Extensive games are game trees, with information sets that model imperfect information of the players. 
Strategies in an extensive game are combinations of moves, so the strategic form has exponential 
size. In contrast, the linear-sized sequence form of the extensive game describes sequences of moves 
and how to randomize between them. 


3.1 Introduction 


A basic model in noncooperative game theory is the strategic form that defines a game 
by a set of strategies for each player and a payoff to each player for any strategy profile 
(which is a combination of strategies, one for each player). The central solution concept 
for such games is the Nash equilibrium, a strategy profile where each strategy is a best 
response to the fixed strategies of the other players. In general, equilibria exist only 
in mixed (randomized) strategies, with probabilities that fulfill certain equations and 
inequalities. Solving these constraints is an algorithmic problem. Its computational 
complexity is discussed in Chapter 2. 

In this chapter, we describe methods for finding equilibria in sufficient detail to
show how they could be implemented. We restrict ourselves to games with two players.
These can be studied using polyhedra, because a player’s expected payoffs are linear
in the mixed strategy probabilities of the other player. Nash equilibria of games with
more than two players involve expected payoffs that are products of the other players’
probabilities. The resulting polynomial equations and inequalities require different
approaches.

For games in strategic form, we give the basic “best response condition” (Prop. 3.1,
see Section 3.2), explain the use of polyhedra (Section 3.3), and describe the
Lemke–Howson algorithm that finds one Nash equilibrium (Section 3.4). An implementation
without numerical errors uses integer pivoting (Section 3.5). “Generic” games (i.e.,
“almost all” games with real payoffs) are nondegenerate (see Definition 3.2); degenerate
games are considered in Section 3.6.

An extensive game (defined in Section 3.7) is a fundamental model of dynamic
interactions. A game tree models in detail the moves available to the players and
their information over time. The nodes of the tree represent game states. An
information set is a set of states in which a player has the same moves, and does
not know which state he is in. A player’s strategy in an extensive game specifies a
move for each information set, so a player may have exponentially many strategies.
This complexity can be reduced: Subgames (see Section 3.8) are subtrees so that all
players know they are in the subgame. Finding equilibria inductively for subgames
leads to subgame perfect equilibria, but this reduces the complexity only if players
are sufficiently often (e.g., always) informed about the game state. The reduced
strategic form applies to general games (see Section 3.9), but may still be
exponential. A player has perfect recall if his information sets reflect that he remembers
his earlier moves. Players can then randomize locally with behavior strategies. This
classic theorem (Corollary 3.12) is turned into an algorithm with the sequence form
(Sections 3.10 and 3.11) which is a strategic description that has the same size as the
game tree.

We give in this chapter an exposition of the main ideas, not of all earliest or latest 
developments of the subject. Section 3.12 summarizes the main references. Further 
research is outlined in Section 3.13. 


3.2 Bimatrix Games and the Best Response Condition 


We use the following notation throughout. Let (A, B) be a bimatrix game, where A and
B are m × n matrices of payoffs to the row player 1 and column player 2, respectively.
This is a two-player game in strategic form (also called “normal form”), which is
played by a simultaneous choice of a row i by player 1 and column j by player 2, who
then receive payoff a_ij and b_ij, respectively. The payoffs represent risk-neutral utilities,
so when facing a probability distribution, the players want to maximize their expected
payoff. These preferences do not depend on positive-affine transformations, so that A
and B can be assumed to have nonnegative entries, which are rationals, or more simply
integers, when A and B define the input to an algorithm.

All vectors are column vectors, so an m-vector x is treated as an m × 1 matrix.
A mixed strategy x for player 1 is a probability distribution on rows, written as an
m-vector of probabilities. Similarly, a mixed strategy y for player 2 is an n-vector of
probabilities for playing columns. The support of a mixed strategy is the set of pure
strategies that have positive probability. A vector or matrix with all components zero
is denoted by 0, a vector of all ones by 1. Inequalities like x ≥ 0 between two vectors
hold for all components. B⊤ is the matrix B transposed.

Let M be the set of the m pure strategies of player 1 and let N be the set of the n
pure strategies of player 2. It is useful to assume that these sets are disjoint, as in


$$M = \{1, \ldots, m\}, \qquad N = \{m+1, \ldots, m+n\}. \qquad (3.1)$$


Then x ∈ R^M and y ∈ R^N, which means, in particular, that the components of y are
y_j for j ∈ N. Similarly, the payoff matrices A and B belong to R^{M×N}.

A best response to the mixed strategy y of player 2 is a mixed strategy x of player 1
that maximizes his expected payoff x⊤Ay. Similarly, a best response y of player 2 to
x maximizes her expected payoff x⊤By. A Nash equilibrium is a pair (x, y) of mixed
strategies that are best responses to each other. The following proposition states that
a mixed strategy x is a best response to an opponent strategy y if and only if all pure
strategies in its support are pure best responses to y. The same holds with the roles of
the players exchanged.


Proposition 3.1 (Best response condition) Let x and y be mixed strategies of
player 1 and 2, respectively. Then x is a best response to y if and only if for all
i ∈ M,


$$x_i > 0 \implies (Ay)_i = u = \max\{ (Ay)_k \mid k \in M \}. \qquad (3.2)$$


PROOF (Ay)_i is the ith component of Ay, which is the expected payoff to
player 1 when playing row i. Then


$$x^\top A y = \sum_{i \in M} x_i (Ay)_i = \sum_{i \in M} x_i \bigl(u - (u - (Ay)_i)\bigr) = u - \sum_{i \in M} x_i \bigl(u - (Ay)_i\bigr).$$


So x⊤Ay ≤ u because x_i ≥ 0 and u − (Ay)_i ≥ 0 for all i ∈ M, and x⊤Ay = u if
and only if x_i > 0 implies (Ay)_i = u, as claimed.


Proposition 3.1 has the following intuition: Player 1’s payoff x⊤Ay is linear in x,
so if it is maximized on a face of the simplex of mixed strategies of player 1, then it is 
also maximized on any vertex (i.e., pure strategy) of that face, and if it is maximized 
on a set of vertices then it is also maximized on any convex combination of them. 
The proposition is useful because it states a finite condition, which is easily checked, 
about all pure strategies of the player, rather than about the infinite set of all mixed 
strategies. It can also be used algorithmically to find Nash equilibria, by trying out 
the different possible supports of mixed strategies. All pure strategies in the support 
must have maximum, and hence equal, expected payoff to that player. This leads to 
equations for the probabilities of the opponent’s mixed strategy. 

As an example, consider the 3 × 2 bimatrix game (A, B) with


$$A = \begin{pmatrix} 3 & 3 \\ 2 & 5 \\ 0 & 6 \end{pmatrix}, \qquad B = \begin{pmatrix} 3 & 2 \\ 2 & 6 \\ 3 & 1 \end{pmatrix}. \qquad (3.3)$$


This game has only one pure-strategy Nash equilibrium, namely the top row (numbered
1 in the pure strategy set M = {1, 2, 3} of player 1), together with the left column (which
by (3.1) has number 4 in the pure strategy set N = {4, 5} of player 2). A pure strategy
equilibrium is given by mixed strategies of support size 1 each, so here it is the mixed
strategy pair ((1, 0, 0)⊤, (1, 0)⊤).

The game in (3.3) also has some mixed equilibria. Any pure strategy of a player
has a unique pure best response of the other player, so in any other equilibrium, each
player must mix at least two pure strategies to fulfill condition (3.2). In particular,
player 2 must be indifferent between her two columns. If the support of player 1’s
mixed strategy x is {1, 2}, then player 1 can make player 2 indifferent by x_1 = 4/5,
x_2 = 1/5, which is the unique solution to the equations x_1 + x_2 = 1 and (for the two
columns of B) 3x_1 + 2x_2 = 2x_1 + 6x_2. In turn, (3.2) requires that player 2 plays with
probabilities y_4 and y_5 so that player 1 is indifferent between rows 1 and 2, i.e.,
3y_4 + 3y_5 = 2y_4 + 5y_5 or (y_4, y_5) = (2/3, 1/3). The vector of expected payoffs to
player 1 is then Ay = (3, 3, 2)⊤ so that (3.2) holds.

A second mixed equilibrium is (x, y) = ((0, 1/3, 2/3)⊤, (1/3, 2/3)⊤) with expected
payoff vectors x⊤B = (8/3, 8/3) and Ay = (3, 4, 4)⊤. Again, the support of x contains
only pure strategies i where the corresponding expected payoff (Ay)_i is maximal.

A third support pair, {1, 3}, for player 1, does not lead to an equilibrium, for two
reasons. First, player 2 would have to play y = (1/2, 1/2)⊤ to make player 1 indifferent
between row 1 and row 3. But then Ay = (3, 7/2, 3)⊤, so that rows 1 and 3 give the
same payoff to player 1 but not the maximum payoff for all rows. Secondly, making
player 2 indifferent via 3x_1 + 3x_3 = 2x_1 + x_3 has the solution x_1 = 2, x_3 = −1 in
order to have x_1 + x_3 = 1, so x is not a vector of probabilities.
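
The three cases just discussed can be verified by applying condition (3.2) mechanically. The sketch below does this for the game (3.3) with exact rational arithmetic; the helper names are illustrative, and rows and columns are indexed from 0 in the code rather than numbered 1, 2, 3 and 4, 5 as in the text.

```python
from fractions import Fraction as F

A = [[3, 3], [2, 5], [0, 6]]
B = [[3, 2], [2, 6], [3, 1]]


def row_payoffs(A, y):
    """The vector Ay: expected payoff to player 1 for each pure row."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]


def col_payoffs(B, x):
    """The vector x^T B: expected payoff to player 2 for each pure column."""
    return [sum(x[i] * B[i][j] for i in range(len(x)))
            for j in range(len(B[0]))]


def is_best_response(payoffs, strategy):
    """Condition (3.2): every pure strategy played with positive probability is optimal."""
    u = max(payoffs)
    return all(prob == 0 or value == u
               for prob, value in zip(strategy, payoffs))


def is_nash(A, B, x, y):
    """True if x and y are best responses to each other."""
    return (is_best_response(row_payoffs(A, y), x) and
            is_best_response(col_payoffs(B, x), y))


pure = ([F(1), F(0), F(0)], [F(1), F(0)])
mixed_12 = ([F(4, 5), F(1, 5), F(0)], [F(2, 3), F(1, 3)])
mixed_23 = ([F(0), F(1, 3), F(2, 3)], [F(1, 3), F(2, 3)])
y_half = [F(1, 2), F(1, 2)]
```

For `y_half` the row payoffs are (3, 7/2, 3), so no strategy of player 1 with support on rows 1 and 3 passes the check, in line with the first reason given above.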

In this “support testing” method, it normally suffices to consider supports of equal
size for the two players. For example, in (3.3) it is not necessary to consider a mixed
strategy x of player 1 where all three pure strategies have positive probability, because
player 1 would then have to be indifferent between all these. However, a mixed strategy
y of player 2 is already uniquely determined by equalizing the expected payoffs for
two rows, and then the payoff for the remaining row is already different. This is the
typical, “nondegenerate” case, according to the following definition.


Definition 3.2. A two-player game is called nondegenerate if no mixed strategy 
of support size k has more than k pure best responses. 


In a degenerate game, Definition 3.2 is violated, for example, if there is a pure strat- 
egy that has two pure best responses. For the moment, we consider only nondegenerate 
games, where the player’s equilibrium strategies have equal sized support, which is 
immediate from Proposition 3.1: 


Proposition 3.3. In any Nash equilibrium (x, y) of a nondegenerate bimatrix 
game, x and y have supports of equal size. 


The “support testing” algorithm for finding equilibria of a nondegenerate bimatrix 
game then works as follows. 


Algorithm 3.4 (Equilibria by support enumeration) Input: A nondegenerate
bimatrix game. Output: All Nash equilibria of the game. Method: For each k =
1, ..., min{m, n} and each pair (I, J) of k-sized subsets of M and N, respectively,
solve the equations ∑_{i∈I} x_i b_ij = v for j ∈ J, ∑_{i∈I} x_i = 1, ∑_{j∈J} a_ij y_j = u for
i ∈ I, ∑_{j∈J} y_j = 1, and check that x ≥ 0, y ≥ 0, and that (3.2) holds for x and
analogously y.


The linear equations considered in this algorithm may not have solutions, which then
means that there is no equilibrium for that support pair. Nonunique solutions occur only for
degenerate games, because a linear dependency allows one to reduce the support of a mixed
strategy. Degenerate games are discussed in Section 3.6 below.
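
The support enumeration method can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes a nondegenerate game, uses exact rational arithmetic, and the helper names `solve` and `support_enumeration` are made up for this example. Support pairs whose equations are singular are simply skipped. The payoff matrices used in the usage note are those of example (3.3), as read off from the inequalities quoted in the text.

```python
from fractions import Fraction
from itertools import combinations

def solve(M, rhs):
    """Solve the square system M z = rhs exactly; return None if M is singular."""
    n = len(M)
    T = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for c in range(n):
        p = next((r for r in range(c, n) if T[r][c] != 0), None)
        if p is None:
            return None
        T[c], T[p] = T[p], T[c]
        T[c] = [v / T[c][c] for v in T[c]]
        for r in range(n):
            if r != c and T[r][c] != 0:
                f = T[r][c]
                T[r] = [a - f * b for a, b in zip(T[r], T[c])]
    return [T[r][n] for r in range(n)]

def support_enumeration(A, B):
    """Nash equilibria of a nondegenerate bimatrix game (A, B), by testing supports."""
    m, n = len(A), len(A[0])
    found = []
    for k in range(1, min(m, n) + 1):
        for I in combinations(range(m), k):
            for J in combinations(range(n), k):
                # Unknowns (x_i for i in I, v): sum_i x_i b_ij - v = 0, sum_i x_i = 1.
                xs = solve([[B[i][j] for i in I] + [-1] for j in J] + [[1] * k + [0]],
                           [0] * k + [1])
                # Unknowns (y_j for j in J, u): sum_j a_ij y_j - u = 0, sum_j y_j = 1.
                ys = solve([[A[i][j] for j in J] + [-1] for i in I] + [[1] * k + [0]],
                           [0] * k + [1])
                if xs is None or ys is None:
                    continue
                x = [Fraction(0)] * m
                y = [Fraction(0)] * n
                for i, val in zip(I, xs):
                    x[i] = val
                for j, val in zip(J, ys):
                    y[j] = val
                v, u = xs[k], ys[k]
                if min(x) < 0 or min(y) < 0:
                    continue
                # Best response condition: no pure strategy pays more than u (or v).
                if any(sum(A[i][j] * y[j] for j in range(n)) > u for i in range(m)):
                    continue
                if any(sum(B[i][j] * x[i] for i in range(m)) > v for j in range(n)):
                    continue
                found.append((tuple(x), tuple(y)))
    return found
```

For `A = [[3, 3], [2, 5], [0, 6]]` and `B = [[3, 2], [2, 6], [3, 1]]`, the function returns three equilibria: the pure one with x = (1, 0, 0), y = (1, 0), and the mixed ones with x = (4/5, 1/5, 0), y = (2/3, 1/3) and x = (0, 1/3, 2/3), y = (1/3, 2/3).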


3.3 Equilibria via Labeled Polytopes 


To identify the possible supports of equilibrium strategies, one can use “best response 
polytopes” that express directly the inequalities of best responses and nonnegative 
probabilities. 

We first recall some notions from the theory of (convex) polyhedra. An affine
combination of points $z_1, \ldots, z_k$ in some Euclidean space is of the form $\sum_{i=1}^{k} z_i \lambda_i$,
where $\lambda_1, \ldots, \lambda_k$ are reals with $\sum_{i=1}^{k} \lambda_i = 1$. It is called a convex combination if $\lambda_i \ge 0$
for all i. A set of points is convex if it is closed under forming convex combinations. 
Given points are affinely independent if none of these points are an affine combination 
of the others. A convex set has dimension d if and only if it has d + 1, but no more, 
affinely independent points. 

A polyhedron $P$ in $\mathbb{R}^d$ is a set $\{z \in \mathbb{R}^d \mid Cz \le q\}$ for some matrix $C$ and vector $q$. It
is called full-dimensional if it has dimension $d$. It is called a polytope if it is bounded.
A face of $P$ is a set $\{z \in P \mid c^\top z = q_0\}$ for some $c \in \mathbb{R}^d$, $q_0 \in \mathbb{R}$ so that the inequality
$c^\top z \le q_0$ holds for all $z$ in $P$. A vertex of $P$ is the unique element of a zero-dimensional
face of $P$. An edge of $P$ is a one-dimensional face of $P$. A facet of a $d$-dimensional
polyhedron $P$ is a face of dimension $d - 1$. It can be shown that any nonempty face
$F$ of $P$ can be obtained by turning some of the inequalities defining $P$ into equalities,
which are then called binding inequalities. That is, $F = \{z \in P \mid c_i z = q_i,\ i \in I\}$,
where $c_i z \le q_i$ for $i \in I$ are some of the rows in $Cz \le q$. A facet is characterized by
a single binding inequality which is irredundant; i.e., the inequality cannot be omitted 
without changing the polyhedron. A d-dimensional polyhedron P is called simple if 
no point belongs to more than d facets of P, which is true if there are no special 
dependencies between the facet-defining inequalities. 

The “best response polyhedron” of a player is the set of that player’s mixed strategies
together with the “upper envelope” of expected payoffs (and any larger payoffs) to the
other player. For player 2 in the example (3.3), it is the set $\bar{Q}$ of triples $(y_4, y_5, u)$ that
fulfill $3y_4 + 3y_5 \le u$, $2y_4 + 5y_5 \le u$, $0y_4 + 6y_5 \le u$, $y_4 \ge 0$, $y_5 \ge 0$, and $y_4 + y_5 = 1$.
The first three inequalities, in matrix notation $Ay \le \mathbf{1}u$, say that $u$ is at least as large
as the expected payoff for each pure strategy of player 1. The other constraints $y \ge 0$
and $\mathbf{1}^\top y = 1$ state that $y$ is a vector of probabilities. The best response polyhedron $\bar{P}$
for player 1 is defined analogously. Generally,


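$$
\begin{aligned}
\bar{P} &= \{(x, v) \in \mathbb{R}^M \times \mathbb{R} \mid x \ge 0,\ \mathbf{1}^\top x = 1,\ B^\top x \le \mathbf{1}v\}, \\
\bar{Q} &= \{(y, u) \in \mathbb{R}^N \times \mathbb{R} \mid Ay \le \mathbf{1}u,\ y \ge 0,\ \mathbf{1}^\top y = 1\}.
\end{aligned}
\qquad (3.4)
$$


Figure 3.1. Best response polyhedron $\bar{Q}$ for strategies of player 2 (plotted over $y_4$ from 0 to 1),
and corresponding polytope $Q$, which has vertices 0, p, q, r, s.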


The left picture in Figure 3.1 shows $\bar{Q}$ for our example, for $0 \le y_4 \le 1$, which uniquely
determines $y_5$ as $1 - y_4$. The circled numbers indicate the facets of $\bar{Q}$, which are either
the strategies $i \in M$ of the other player 1 or the own strategies $j \in N$. Facets 1, 2, 3 of
player 1 indicate his best responses together with his expected payoff $u$. For example,
1 is a best response when $y_4 \ge 2/3$. Facets 4 and 5 of player 2 tell when the respective
own strategy has probability zero, namely $y_4 = 0$ or $y_5 = 0$.

We say a point $(y, u)$ of $\bar{Q}$ has label $k \in M \cup N$ if the $k$th inequality in the definition
of $\bar{Q}$ is binding, which for $k = i \in M$ is the $i$th binding inequality $\sum_{j \in N} a_{ij} y_j = u$
(meaning $i$ is a best response to $y$), or for $k = j \in N$ the binding inequality $y_j = 0$.
In the example, $(y_4, y_5, u) = (2/3, 1/3, 3)$ has labels 1 and 2, so rows 1 and 2 are
best responses to $y$ with expected payoff 3 to player 1. The labels of a point $(x, v)$
of $\bar{P}$ are defined correspondingly: It has label $i \in M$ if $x_i = 0$, and label $j \in N$ if
$\sum_{i \in M} b_{ij} x_i = v$. With these labels, an equilibrium is a pair $(x, y)$ of mixed strategies
so that with the corresponding expected payoffs $v$ and $u$, the pair $((x, v), (y, u))$ in
$\bar{P} \times \bar{Q}$ is completely labeled, which means that every label $k \in M \cup N$ appears as a
label either of $(x, v)$ or of $(y, u)$. This is equivalent to the best response condition (3.2):
A missing label would mean a pure strategy of a player, e.g., $i$ of player 1, that does not
have probability zero, so $x_i > 0$, and is also not a best response, since $\sum_{j \in N} a_{ij} y_j < u$,
because the respective inequality $i$ is not binding in $\bar{P}$ or $\bar{Q}$. But this is exactly when
the best response condition is violated. Conversely, if every label appears in $\bar{P}$ or $\bar{Q}$,
then each pure strategy is a best response or has probability zero, so $x$ and $y$ are mutual
best responses.

The constraints (3.4) that define $\bar{P}$ and $\bar{Q}$ can be simplified by eliminating the payoff
variables $u$ and $v$, which works if these are always positive. For that purpose, assume
that


$A$ and $B^\top$ are nonnegative and have no zero column. (3.5)


Figure 3.2. The best response polytopes P (with vertices 0, a, b, c, d, e) and Q for the game
in (3.3). The arrows describe the Lemke-Howson algorithm (see Section 3.4).


We could simply assume A > 0 and B > 0, but it is useful to admit zero matrix entries 
(e.g., as in the identity matrix); even negative entries are possible as long as the upper 
envelope remains positive, e.g., for a34 (currently zero) in (3.3), as Figure 3.1 shows. 

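For $\bar{P}$, we divide each inequality $\sum_{i \in M} b_{ij} x_i \le v$ by $v$, which gives
$\sum_{i \in M} b_{ij} (x_i / v) \le 1$, treat $x_i / v$ as a new variable that we call again $x_i$, and call the
resulting polyhedron $P$. Similarly, $\bar{Q}$ is replaced by $Q$ by dividing each inequality in
$Ay \le \mathbf{1}u$ by $u$. Then

$$
\begin{aligned}
P &= \{x \in \mathbb{R}^M \mid x \ge 0,\ B^\top x \le \mathbf{1}\}, \\
Q &= \{y \in \mathbb{R}^N \mid Ay \le \mathbf{1},\ y \ge 0\}.
\end{aligned}
\qquad (3.6)
$$
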
It is easy to see that (3.5) implies that $P$ and $Q$ are full-dimensional polytopes, unlike
$\bar{P}$ and $\bar{Q}$. In effect, we have normalized the expected payoffs to be 1, and dropped the
conditions $\mathbf{1}^\top x = 1$ and $\mathbf{1}^\top y = 1$. Nonzero vectors $x \in P$ and $y \in Q$ are multiplied by
$v = 1/\mathbf{1}^\top x$ and $u = 1/\mathbf{1}^\top y$ to turn them into probability vectors. The scaling factors $v$
and $u$ are the expected payoffs to the other player.
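
As a worked instance of this normalization (a small check added for illustration, using the point with labels 1 and 2 from the example), take the mixed strategy $(y_4, y_5) = (2/3, 1/3)$ with payoff $u = 3$ to player 1. Dividing by $u$ gives a point of the normalized polytope, and rescaling by the reciprocal of the coordinate sum recovers the mixed strategy:

```latex
\[
  \frac{1}{u}\,(y_4, y_5) = \tfrac{1}{3}\left(\tfrac{2}{3}, \tfrac{1}{3}\right)
  = \left(\tfrac{2}{9}, \tfrac{1}{9}\right),
  \qquad
  A \begin{pmatrix} 2/9 \\ 1/9 \end{pmatrix}
  = \begin{pmatrix} 3\cdot\tfrac{2}{9} + 3\cdot\tfrac{1}{9} \\[2pt]
                    2\cdot\tfrac{2}{9} + 5\cdot\tfrac{1}{9} \\[2pt]
                    6\cdot\tfrac{1}{9} \end{pmatrix}
  = \begin{pmatrix} 1 \\ 1 \\ 2/3 \end{pmatrix} \le \mathbf{1},
\]
\[
  u = \frac{1}{\tfrac{2}{9} + \tfrac{1}{9}} = 3,
  \qquad
  3 \cdot \left(\tfrac{2}{9}, \tfrac{1}{9}\right) = \left(\tfrac{2}{3}, \tfrac{1}{3}\right).
\]
```

The first two inequalities are binding, so the normalized point keeps the labels 1 and 2 of the original point.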

The set $\bar{P}$ is in one-to-one correspondence with $P - \{0\}$ with the map $(x, v) \mapsto x \cdot
(1/v)$. Similarly, $(y, u) \mapsto y \cdot (1/u)$ defines a bijection $\bar{Q} \to Q - \{0\}$. These bijections
are not linear, but are known as “projective transformations” (for a visualization see von
Stengel, 2002, Fig. 2.5). They preserve the face incidences since a binding inequality in
$\bar{P}$ (respectively, $\bar{Q}$) corresponds to a binding inequality in $P$ (respectively, $Q$) and vice
versa. In particular, points have the same labels defined by the binding inequalities,
which are some of the $m + n$ inequalities defining $P$ and $Q$ in (3.6). An equilibrium
is then a completely labeled pair $(x, y) \in P \times Q - \{(0, 0)\}$, which has for each label
$i \in M$ the respective binding inequality in $x \ge 0$ or $Ay \le \mathbf{1}$, and for each $j \in N$ the
respective binding inequality in $B^\top x \le \mathbf{1}$ or $y \ge 0$.

For the example (3.3), the polytope $Q$ is shown on the right in Figure 3.1 and in
Figure 3.2. The vertices $y$ of $Q$, written as $y^\top$, are (0, 0) with labels 4, 5, vertex p =
(0, 1/6) with labels 3, 4, vertex q = (1/12, 1/6) with labels 2, 3, vertex r = (2/9, 1/9)
with labels 1, 2, and s = (1/3, 0) with labels 1, 5. The polytope $P$ is shown on the
left in Figure 3.2. Its vertices $x$ are 0 with labels 1, 2, 3, and (written as $x^\top$) vertex
a = (1/3, 0, 0) with labels 2, 3, 4, vertex b = (2/7, 1/14, 0) with labels 3, 4, 5, vertex
c = (0, 1/6, 0) with labels 1, 3, 5, vertex d = (0, 1/8, 1/4) with labels 1, 4, 5, and
e = (0, 0, 1/3) with labels 1, 2, 4. Note that the vectors alone show only the “own”
labels as the unplayed own strategies; the information about the other player’s best
responses is important as well. The following three completely labeled vertex pairs
define the Nash equilibria of the game, which we already found earlier: the pure
strategy equilibrium (a, s), and the mixed equilibria (b, r) and (d, q). The vertices c
and e of P, and p of Q, are not part of an equilibrium.
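
The labels of these vertices can be recomputed mechanically from the inequalities in (3.6). The following sketch does this with exact fractions; the function names `labels_P` and `labels_Q` and the dictionaries of named vertices are introduced here only for illustration. The vertex r is taken as (2/9, 1/9), the point where the inequalities with labels 1 and 2 are both binding.

```python
from fractions import Fraction as F

A = [[3, 3], [2, 5], [0, 6]]   # payoffs to player 1 in example (3.3)
B = [[3, 2], [2, 6], [3, 1]]   # payoffs to player 2 in example (3.3)
m, n = 3, 2                    # labels 1..3 belong to player 1, labels 4..5 to player 2

def labels_P(x):
    """Labels of x in P: i if x_i = 0, and m + j if column j of B is a binding best response."""
    own = {i + 1 for i in range(m) if x[i] == 0}
    best = {m + j + 1 for j in range(n)
            if sum(B[i][j] * x[i] for i in range(m)) == 1}
    return own | best

def labels_Q(y):
    """Labels of y in Q: i if row i of A is a binding best response, and m + j if y_j = 0."""
    best = {i + 1 for i in range(m)
            if sum(A[i][j] * y[j] for j in range(n)) == 1}
    own = {m + j + 1 for j in range(n) if y[j] == 0}
    return best | own

P_vertices = {"a": (F(1, 3), 0, 0), "b": (F(2, 7), F(1, 14), 0), "c": (0, F(1, 6), 0),
              "d": (0, F(1, 8), F(1, 4)), "e": (0, 0, F(1, 3))}
Q_vertices = {"p": (0, F(1, 6)), "q": (F(1, 12), F(1, 6)),
              "r": (F(2, 9), F(1, 9)), "s": (F(1, 3), 0)}

all_labels = set(range(1, m + n + 1))
complete = [(s, t) for s, x in P_vertices.items() for t, y in Q_vertices.items()
            if labels_P(x) | labels_Q(y) == all_labels]
```

With these inputs, `complete` is `[("a", "s"), ("b", "r"), ("d", "q")]`, the three completely labeled pairs named in the text.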

Nondegeneracy of a bimatrix game (A, B) can be stated in terms of the polytopes 
P and Q in (3.6) as follows: no point in P has more than m labels, and no point in Q 
has more than $n$ labels. (If $x \in P$ and $x$ has support of size $k$ and $L$ is the set of labels
of $x$, then $|L \cap M| = m - k$, so $|L| > m$ implies $x$ has more than $k$ best responses in
$L \cap N$.) Then $P$ and $Q$ are simple polytopes, because a point of $P$, say, that is on more
than m facets would have more than m labels. Even if P and Q are simple polytopes, the 
game can be degenerate if the description of a polytope is redundant in the sense that 
some inequality can be omitted, but nevertheless is sometimes binding. This occurs 
if a player has a pure strategy that is weakly dominated by or payoff equivalent to 
some other mixed strategy. Nonsimple polytopes or redundant inequalities of this kind 
do not occur for “generic” payoffs; this illustrates the assumption of nondegeneracy 
from a geometric viewpoint. (A strictly dominated strategy may occur generically, 
but it defines a redundant inequality that is never binding, so this does not lead to a 
degenerate game.) 

Because the game is nondegenerate, only vertices of P can have m labels, and only 
vertices of Q can have n labels. Otherwise, a point of P with m labels that is not a 
vertex would be on a higher dimensional face, and a vertex of that face, which is a 
vertex of P, would have additional labels. Consequently, only vertices of P and Q 
have to be inspected as possible equilibrium strategies. 


Algorithm 3.5 (Equilibria by vertex enumeration) Input: A nondegenerate
bimatrix game. Output: All Nash equilibria of the game. Method: For each vertex
$x$ of $P - \{0\}$, and each vertex $y$ of $Q - \{0\}$, if $(x, y)$ is completely labeled, output
the Nash equilibrium $(x \cdot 1/\mathbf{1}^\top x,\ y \cdot 1/\mathbf{1}^\top y)$.
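
A brute-force version of this vertex enumeration can be sketched as follows. It is only an illustration under simplifying assumptions: the game is nondegenerate and satisfies (3.5), vertices are found by solving every choice of d binding inequalities (which is far less efficient than the reverse search method described below), and the names `solve`, `vertices` and `equilibria` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def solve(M, rhs):
    """Solve the square system M z = rhs exactly; return None if M is singular."""
    n = len(M)
    T = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for c in range(n):
        p = next((r for r in range(c, n) if T[r][c] != 0), None)
        if p is None:
            return None
        T[c], T[p] = T[p], T[c]
        T[c] = [v / T[c][c] for v in T[c]]
        for r in range(n):
            if r != c and T[r][c] != 0:
                f = T[r][c]
                T[r] = [a - f * b for a, b in zip(T[r], T[c])]
    return [T[r][n] for r in range(n)]

def vertices(C):
    """Vertices of {z >= 0, C z <= 1}, each mapped to its set of binding inequalities.
    Index t < d stands for z_t >= 0, index d + r for row r of C z <= 1."""
    rows, d = len(C), len(C[0])
    ineqs = [([int(s == t) for s in range(d)], 0) for t in range(d)]
    ineqs += [(list(C[r]), 1) for r in range(rows)]
    result = {}
    for pick in combinations(range(d + rows), d):
        z = solve([ineqs[t][0] for t in pick], [ineqs[t][1] for t in pick])
        if z is None or min(z) < 0:
            continue
        if any(sum(c * v for c, v in zip(row, z)) > 1 for row in C):
            continue
        result[tuple(z)] = frozenset(
            t for t, (coef, b) in enumerate(ineqs)
            if sum(c * v for c, v in zip(coef, z)) == b)
    return result

def equilibria(A, B):
    """All Nash equilibria of a nondegenerate bimatrix game, via completely labeled vertex pairs."""
    m, n = len(A), len(A[0])
    Bt = [[B[i][j] for i in range(m)] for j in range(n)]
    P, Q = vertices(Bt), vertices(A)
    out = []
    for x, lx in P.items():
        for y, ly in Q.items():
            if not any(x) or not any(y):
                continue  # skip the zero vertex of either polytope
            labels = {t + 1 for t in lx} | {m + t + 1 if t < n else t - n + 1 for t in ly}
            if len(labels) == m + n:
                out.append((tuple(v / sum(x) for v in x), tuple(v / sum(y) for v in y)))
    return out
```

For the game (3.3), `vertices` finds six vertices of P and five of Q, and `equilibria` returns the same three equilibria as the support enumeration method.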


Algorithm 3.5 is superior to the support enumeration Algorithm 3.4 because there are
more supports than vertices. For example, if $m = n$, then approximately $4^n$ possible
support pairs have to be tested, but $P$ and $Q$ each have fewer than $2.6^n$ vertices,
by the “upper bound theorem” for polytopes. This entails less work, assuming that
complementary vertex pairs $(x, y)$ are found efficiently.
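
To make the count of support pairs concrete, the following short sketch (the function name `support_pairs` is made up for this illustration) sums the number of pairs of equal-sized supports for $m = n$. By the Vandermonde identity this sum equals $\binom{2n}{n} - 1$, which grows like $4^n$ up to a polynomial factor.

```python
from math import comb

def support_pairs(n):
    """Number of pairs (I, J) of equal-sized nonempty supports in an n x n game."""
    return sum(comb(n, k) ** 2 for k in range(1, n + 1))

# The closed form comb(2n, n) - 1 agrees with the sum for small n.
checks = [support_pairs(n) == comb(2 * n, n) - 1 for n in range(1, 12)]
```

For example, `support_pairs(10)` is 184755, while $2.6^{10}$ is roughly 14117.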

Enumerating all vertices of a polytope $P$, say, is a standard problem in computational
geometry. The elegant lrs (lexicographic reverse search) algorithm considers a known
vertex, like 0 for $P$ in (3.6), and a linear objective function that, over $P$, is maximized
at that vertex, like the function $x \mapsto -\mathbf{1}^\top x$. For any vertex of $P$, the simplex algorithm
with a unique pivoting rule (e.g., Bland’s least-index rule for choosing the entering
and leaving variable) then generates a unique path to 0, defining a directed tree on the
vertices of $P$ with root 0. The algorithm explores that tree by a depth-first search from
0 which “reverts” the simplex steps by considering recursively for each vertex $x$ of $P$
the edges to vertices $x'$ so that the simplex algorithm pivots from $x'$ to $x$.


3.4 The Lemke-Howson Algorithm


Algorithms 3.4 and 3.5 find all Nash equilibria of a nondegenerate bimatrix game 
(A, B). In contrast, the Lemke-Howson (for short LH) algorithm finds one Nash
equilibrium, and provides an elementary proof that Nash equilibria exist. The LH 
algorithm follows a path (called LH path) of vertex pairs (x, y) of P x Q, for the 
polytopes P and Q defined in (3.6), that starts at (0, 0) and ends at a Nash equilibrium. 

An LH path alternately follows edges of P and Q, keeping the vertex in the other 
polytope fixed. Because the game is nondegenerate, a vertex of P is given by m labels, 
and a vertex of Q is given by n labels. An edge of P is defined by m — 1 labels. For 
example, in Figure 3.2 the edge defined by labels 1 and 3 joins the vertices 0 and c. 
Dropping a label $l$ of a vertex $x$ of $P$, say, means traversing the unique edge that has
all the labels of $x$ except for $l$. For example, dropping label 2 of the vertex 0 of $P$
in Figure 3.2 gives the edge, defined by labels 1 and 3, that joins 0 to vertex c. The 
endpoint of the edge has a new label, which is said to be picked up, so in the example 
label 5 is picked up at vertex c. 

The LH algorithm starts from (0,0) in P x Q. This is called the artificial equi- 
librium, which is a completely labeled vertex pair because every pure strategy has 
probability zero. It does not represent a Nash equilibrium of the game because the zero 
vector cannot be rescaled to a mixed strategy vector. An initial free choice of the LH 
algorithm is a pure strategy $k$ of a player (any label in $M \cup N$), called the missing label.
Starting with $(x, y) = (0, 0)$, label $k$ is dropped. At the endpoint of the corresponding
edge (of $P$ if $k \in M$, of $Q$ if $k \in N$), the new label that is picked up is duplicate
because it was present in the other polytope. That duplicate label is then dropped in the 
other polytope, picking up a new label. If the newly picked label is the missing label, 
the algorithm terminates and has found a Nash equilibrium. Otherwise, the algorithm 
repeats by dropping the duplicate label in the other polytope, and continues in this 
fashion. 

In the example (3.3), suppose that the missing label is k = 2. The polytopes P and 
Q are shown in Figure 3.2. Starting from 0 in P, label 2 is dropped, traversing the edge 
from 0 to vertex c, which is the set of points x of P that have labels 1 and 3, shown 
by an arrow in Figure 3.2. The endpoint c of that edge has label 5 which is picked up. 
At the vertex pair (c, 0) of P x Q, all labels except for the missing label 2 are present, 
so label 5 is now duplicate because it is both a label of c and of 0. The next step is 
therefore to drop the duplicate label 5 in Q, traversing the edge from 0 to vertex p 
while keeping c in P fixed. The label that is picked up at vertex p is 3, which is now 
duplicate. Dropping label 3 in P defines the unique edge defined by labels 1 and 5, 
which joins vertex c to vertex d. At vertex d, label 4 is picked up. Dropping label 4 
in Q means traversing the edge of Q from p to q. At vertex q, label 2 is picked up. 
Because 2 is the missing label, the current vertex pair (d, q) is completely labeled, and 
it is the Nash equilibrium found by the algorithm. 

In terms of the game, the first two LH steps amount to taking a pure strategy (given 
by the missing label k, say of player 1) and considering its best response, say j, which 
defines a pure strategy pair $(k, j)$. If this is not already an equilibrium, the best response
$i$ to $j$ is not $k$, so that $i$ is a duplicate label, and is now given positive probability in
addition to $k$. In general, one possibility is that a duplicate label is a new best response
which in the next step gets positive probability, as in this case. Alternatively, the 
duplicate label is a pure strategy whose probability has just become zero, so that it no 
longer needs to be maintained as a best response in the other polytope and the path 
moves away from the best response facet. 


Algorithm 3.6 (Lemke-Howson) Input: Nondegenerate bimatrix game. Output:
One Nash equilibrium of the game. Method: Choose $k \in M \cup N$, called the
missing label. Let $(x, y) = (0, 0) \in P \times Q$. Drop label $k$ (from $x$ in $P$ if $k \in M$,
from $y$ in $Q$ if $k \in N$). Loop: Call the new vertex pair $(x, y)$. Let $l$ be the label
that is picked up. If $l = k$, terminate with Nash equilibrium $(x, y)$ (rescaled as
mixed strategy pair). Otherwise, drop $l$ in the other polytope and repeat.
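
The loop of the algorithm can be sketched with two small tableaus, one for each polytope, in which the column of a variable is indexed by its label. This is an illustrative sketch under stated assumptions: the game is nondegenerate (ties in the minimum ratio test are not handled), it satisfies (3.5), and it uses the slack-variable form of P and Q with rational arithmetic rather than the integer pivoting of Section 3.5. The names `pivot` and `lemke_howson` are hypothetical.

```python
from fractions import Fraction

def pivot(T, basis, col):
    """Enter column `col` into the basis of tableau T; return the leaving column."""
    rows = [r for r in range(len(T)) if T[r][col] > 0]
    r = min(rows, key=lambda q: T[q][-1] / T[q][col])   # minimum ratio test
    T[r] = [v / T[r][col] for v in T[r]]
    for q in range(len(T)):
        if q != r and T[q][col] != 0:
            f = T[q][col]
            T[q] = [a - f * b for a, b in zip(T[q], T[r])]
    leaving, basis[r] = basis[r], col
    return leaving

def lemke_howson(A, B, k):
    """One Nash equilibrium for missing label k in 1..m+n (labels 1..m are player 1's)."""
    m, n = len(A), len(A[0])
    # TP encodes B^T x + s = 1; columns are x_1..x_m, s_{m+1}..s_{m+n}, right-hand side.
    TP = [[Fraction(B[i][j]) for i in range(m)]
          + [Fraction(int(j == jj)) for jj in range(n)] + [Fraction(1)] for j in range(n)]
    # TQ encodes r + A y = 1; columns are r_1..r_m, y_{m+1}..y_{m+n}, right-hand side.
    TQ = [[Fraction(int(i == ii)) for ii in range(m)]
          + [Fraction(A[i][j]) for j in range(n)] + [Fraction(1)] for i in range(m)]
    basisP = [m + j for j in range(n)]   # slack variables s are basic at x = 0
    basisQ = list(range(m))              # slack variables r are basic at y = 0
    in_P = k <= m
    label = k
    while True:
        T, basis = (TP, basisP) if in_P else (TQ, basisQ)
        label = pivot(T, basis, label - 1) + 1   # drop `label`, pick up a new one
        if label == k:
            break
        in_P = not in_P                          # drop the duplicate label on the other side
    x = [Fraction(0)] * m
    y = [Fraction(0)] * n
    for r, c in enumerate(basisP):
        if c < m:
            x[c] = TP[r][-1]
    for r, c in enumerate(basisQ):
        if c >= m:
            y[c - m] = TQ[r][-1]
    return tuple(v / sum(x) for v in x), tuple(v / sum(y) for v in y)
```

For the game (3.3) with `A = [[3, 3], [2, 5], [0, 6]]` and `B = [[3, 2], [2, 6], [3, 1]]`, missing label 2 picks up the labels 5, 3, 4, 2 in turn and ends at x = (0, 1/3, 2/3), y = (1/3, 2/3), the rescaled pair (d, q); missing label 1 ends at the pure equilibrium x = (1, 0, 0), y = (1, 0).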


The LH algorithm terminates, and finds a Nash equilibrium, because P x Q has 
only finitely many vertex pairs. The next vertex pair on the path is always unique. 
Hence, a given vertex pair cannot be revisited because that would provide an additional 
possibility to proceed in the first place. 

We have described the LH path for missing label k by means of alternating edges 
between two polytopes. In fact, it is a path on the product polytope $P \times Q$, given by
the set of pairs $(x, y)$ of $P \times Q$ that are $k$-almost completely labeled, meaning that
every label in $M \cup N - \{k\}$ appears as a label of either $x$ or $y$. In Figure 3.2 for $k = 2$,
the vertex pairs on the path are (0, 0), (c, 0), (c, p), (d, p), (d, q). 

For a fixed missing label $k$, the $k$-almost completely labeled vertices and edges of the
product polytope $P \times Q$ form a graph of degree 1 or 2. Clearly, such a graph consists of
disjoint paths and cycles. The endpoints of the paths are completely labeled. They are
the Nash equilibria of the game and the artificial equilibrium (0, 0). Since the number
of endpoints of the paths is even, we obtain the following.


Corollary 3.7 A nondegenerate bimatrix game has an odd number of Nash 
equilibria. 


The LH algorithm can start at any Nash equilibrium, not just the artificial equilib- 
rium. In Figure 3.2 with missing label 2, starting the algorithm at the Nash equilibrium 
(d, q) would just generate the known LH path backward to (0, 0). When started at the 
Nash equilibrium (a, s), the LH path for the missing label 2 gives the vertex pair (b, s), 
where label 5 is duplicate, and then the equilibrium (b, r). This path cannot go back 
to (0, 0) because the path leading to (0, 0) starts at (d, q). This gives the three Nash 
equilibria of the game as endpoints of the two LH paths for missing label 2. 

These three equilibria can also be found by the LH algorithm by varying the missing 
label. For example, the LH path for missing label 1 in Figure 3.2 leads to (a, s), from 
which (b, r) is subsequently found via missing label 2. 

However, some Nash equilibria can remain elusive to the LH algorithm. An example 
is the following symmetric 3 x 3 game with 


$$A = B^\top = \begin{pmatrix} 3 & 3 & 0 \\ 4 & 0 & 1 \\ 0 & 4 & 5 \end{pmatrix}. \qquad (3.7)$$

Every Nash equilibrium $(x, y)$ of this game is symmetric, i.e., $x = y$, where $x^\top$ is
(0, 0, 1), (1/2, 1/4, 1/4), or (3/4, 1/4, 0). Only the first of these is found by the LH
algorithm, for any missing label; because the game is symmetric, it suffices to consider 
the missing labels 1, 2,3. (A symmetric game remains unchanged when the players 
are exchanged; a symmetric game always has a symmetric equilibrium, but may also
have nonsymmetric equilibria, which obviously come in pairs.) 
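
The three symmetric equilibria listed for the game (3.7) can be checked directly against the best response condition. The sketch below assumes the matrix entries are read row by row as (3, 3, 0), (4, 0, 1), (0, 4, 5); the function name `is_equilibrium` is introduced only for this check.

```python
from fractions import Fraction as F

A = [[3, 3, 0], [4, 0, 1], [0, 4, 5]]
B = [list(col) for col in zip(*A)]   # A = B^T, so B is the transpose of A

def is_equilibrium(A, B, x, y):
    """Best response condition: every pure strategy played with positive probability
    must earn the maximum expected payoff against the other player's strategy."""
    m, n = len(A), len(A[0])
    row_pay = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    col_pay = [sum(B[i][j] * x[i] for i in range(m)) for j in range(n)]
    return (all(x[i] == 0 or row_pay[i] == max(row_pay) for i in range(m))
            and all(y[j] == 0 or col_pay[j] == max(col_pay) for j in range(n)))

candidates = [(0, 0, 1), (F(1, 2), F(1, 4), F(1, 4)), (F(3, 4), F(1, 4), 0)]
results = [is_equilibrium(A, B, x, x) for x in candidates]
```

All three entries of `results` are `True`, whereas a pair such as x = y = (1, 0, 0) fails the check because row 2 earns 4 against column 1 while row 1 earns only 3.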


3.5 Integer Pivoting 


The LH algorithm follows the edges of a polyhedron, which is implemented alge- 
braically by pivoting as used by the simplex algorithm for solving a linear program. We 
describe an efficient implementation that has no numerical errors by storing integers of 
arbitrary precision. The constraints defining the polyhedron are thereby represented as 
linear equations with nonnegative slack variables. For the polytopes $P$ and $Q$ in (3.6),
these slack variables are nonnegative vectors $s \in \mathbb{R}^N$ and $r \in \mathbb{R}^M$ so that $x \in P$ and
$y \in Q$ if and only if


$$B^\top x + s = \mathbf{1}, \qquad r + Ay = \mathbf{1}, \qquad (3.8)$$
and
$$x \ge 0, \quad s \ge 0, \quad r \ge 0, \quad y \ge 0. \qquad (3.9)$$


A binding inequality corresponds to a zero slack variable. The pair $(x, y)$ is completely
labeled if and only if $x_i r_i = 0$ for all $i \in M$ and $y_j s_j = 0$ for all $j \in N$, which by (3.9)
can be written as the orthogonality condition


$$x^\top r = 0, \qquad y^\top s = 0. \qquad (3.10)$$
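
As a small worked check of the orthogonality condition (added for illustration), consider the completely labeled pair $x = (0, 1/8, 1/4)^\top$ and $y = (1/12, 1/6)^\top$ of the example (3.3), that is, the vertices d and q. The slack vectors follow from (3.8):

```latex
\[
  s = \mathbf{1} - B^\top x
    = \begin{pmatrix} 1 - (2\cdot\tfrac18 + 3\cdot\tfrac14) \\[2pt]
                      1 - (6\cdot\tfrac18 + 1\cdot\tfrac14) \end{pmatrix}
    = \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \qquad
  r = \mathbf{1} - Ay
    = \begin{pmatrix} 1 - (3\cdot\tfrac1{12} + 3\cdot\tfrac16) \\[2pt]
                      1 - (2\cdot\tfrac1{12} + 5\cdot\tfrac16) \\[2pt]
                      1 - 6\cdot\tfrac16 \end{pmatrix}
    = \begin{pmatrix} 1/4 \\ 0 \\ 0 \end{pmatrix},
\]
\[
  x^\top r = 0\cdot\tfrac14 + \tfrac18\cdot 0 + \tfrac14\cdot 0 = 0,
  \qquad
  y^\top s = \tfrac1{12}\cdot 0 + \tfrac16\cdot 0 = 0 .
\]
```

The only positive slack, $r_1 = 1/4$, belongs to label 1, which is present in $P$ because $x_1 = 0$.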


A basic solution to (3.8) is given by $n$ basic (linearly independent) columns of
$B^\top x + s = \mathbf{1}$ and $m$ basic columns of $r + Ay = \mathbf{1}$, where the nonbasic variables that
correspond to the $m$ respectively $n$ other (nonbasic) columns are set to zero, so that the
basic variables are uniquely determined. A basic feasible solution also fulfills (3.9), 
and defines a vertex x of P and y of Q. The labels of such a vertex are given by the 
respective nonbasic columns. 

Pivoting is a change of the basis where a nonbasic variable enters and a basic variable
leaves the set of basic variables, while preserving feasibility (3.9). We illustrate this for
the edges of the polytope $P$ in Figure 3.2 shown as arrows, which are the edges that
connect 0 to vertex c, and c to d. The system $B^\top x + s = \mathbf{1}$ is here


$$
\begin{aligned}
3x_1 + 2x_2 + 3x_3 + s_4 &= 1 \\
2x_1 + \boxed{6}\,x_2 + x_3 + s_5 &= 1
\end{aligned}
\qquad (3.11)
$$


and the basic variables in (3.11) are $s_4$ and $s_5$, defining the basic feasible solution $s_4 = 1$
and $s_5 = 1$, which is simply the right-hand side of (3.11) because the basic columns
form the identity matrix. Dropping label 2 means that $x_2$ is no longer a nonbasic
variable, so $x_2$ enters the basis. Increasing $x_2$ while maintaining (3.11) changes the
current basic variables as $s_4 = 1 - 2x_2$, $s_5 = 1 - 6x_2$, and these stay nonnegative as
long as $x_2 \le 1/6$. The term 1/6 is the minimum ratio, over all rows in (3.11) with
positive coefficients of the entering variable $x_2$, of the right-hand side divided by the
coefficient. (Only positive coefficients bound the increase of $x_2$, which applies to at
least one row since the polyhedron $P$ is bounded.) The minimum ratio test determines
uniquely $s_5$ as the variable that leaves the basis, giving the label 5 that is picked up in
that step. The respective coefficient 6 of $x_2$ is indicated by a box in (3.11), and is called
the pivot element; its row is the pivot row and its column is the pivot column.

Algebraically, pivoting is done by applying row operations to (3.11) so that the new 
basic variable x2 has a unit column, so that the basic solution is again given by the 
right-hand side. Integer pivoting is a way to achieve this while keeping all coefficients 
of the system as integers; the basic columns then form an identity matrix multiplied by 
an integer. To that end, all rows (which in (3.11) is only the first row) except for the 
pivot row are multiplied with the pivot element, giving the intermediate system 


$$
\begin{aligned}
18x_1 + 12x_2 + 18x_3 + 6s_4 &= 6 \\
2x_1 + 6x_2 + x_3 + s_5 &= 1
\end{aligned}
\qquad (3.12)
$$


Then, suitable multiples of the pivot row are subtracted from the other rows to obtain 
zero entries in the pivot column, giving the new system 


$$
\begin{aligned}
14x_1 + \boxed{16}\,x_3 + 6s_4 - 2s_5 &= 4 \\
2x_1 + 6x_2 + x_3 + s_5 &= 1
\end{aligned}
\qquad (3.13)
$$


In (3.13), the basic columns for the basic variables $s_4$ and $x_2$ form the identity matrix,
multiplied by 6 (which is the pivot element that has just been used). Clearly, all matrix
entries are integers. The next step of the LH algorithm in the example is to let $y_5$ be the
entering variable in the system $r + Ay = \mathbf{1}$, which we do not show. There, the leaving
variable is $r_3$ (giving the duplicate label 3) so that the next entering variable in (3.13)
is $x_3$. The minimum ratio test (which can be performed using only multiplications,
not divisions) shows that among the nonnegativity constraints $6s_4 = 4 - 16x_3 \ge 0$ and
$6x_2 = 1 - x_3 \ge 0$, the former is tighter so that $s_4$ is the leaving variable. The pivot
element, shown by a box in (3.13), is 16, with the first row as pivot row.

The integer pivoting step is to multiply the other rows with the pivot element, giving 


$$
\begin{aligned}
14x_1 + 16x_3 + 6s_4 - 2s_5 &= 4 \\
32x_1 + 96x_2 + 16x_3 + 16s_5 &= 16
\end{aligned}
\qquad (3.14)
$$


Subsequently, a suitable multiple of the pivot row is subtracted from each other row, 
giving the new system 


$$
\begin{aligned}
14x_1 + 16x_3 + 6s_4 - 2s_5 &= 4 \\
18x_1 + 96x_2 - 6s_4 + 18s_5 &= 12
\end{aligned}
\qquad (3.15)
$$


with $x_3$ and $x_2$ as basic variables. However, except for the pivot row, the unchanged
basic variables have larger coefficients than before, because they have been multiplied
with the new pivot element 16. The second row in (3.15) can now be divided by the
previous pivot element 6, and this division is integral for all coefficients in that row;
this is the key feature of integer pivoting, explained shortly. The new system is


$$
\begin{aligned}
14x_1 + 16x_3 + 6s_4 - 2s_5 &= 4 \\
3x_1 + 16x_2 - s_4 + 3s_5 &= 2
\end{aligned}
\qquad (3.16)
$$

This is the final system because the duplicate label 4 (given by the variable $s_4$ that has
just left) is dropped in $Q$, where the missing label 2 is picked up. The basic solution in
(3.16) is vertex d of $P$ with $x_3 = 4/16$, $x_2 = 2/16$, and labels (given by the nonbasic
columns) 1, 4, and 5.

Integer pivoting, as illustrated in this example, always maintains an integer matrix
(or “tableau”) of coefficients of a system of linear equations that is equivalent to the
original system $B^\top x + s = \mathbf{1}$, in the form


$$CB^\top x + Cs = C\mathbf{1}. \qquad (3.17)$$


In (3.17), C is the inverse of the basis matrix given by the basic columns of the original 
system, multiplied by the determinant of the basis matrix (which is 6 in (3.13), and 
16 in (3.16)). The matrix C is given by the (integer) cofactors of the basis matrix; the 
cofactor of a matrix entry is the determinant of the matrix when the row and column 
of that element are deleted. Each entry in (3.17) has a bounded number of digits (by at
most a factor of $n \log n$ compared to the original matrix entries), so integer pivoting is
a polynomial-time algorithm. It is also superior to using fractions of integers (rational
numbers) because their cancelation requires greatest common divisor computations
that take the bulk of computation time. Only the final fractions defining the solution,
like $x_3 = 4/16$ and $x_2 = 2/16$ in (3.16), may have to be canceled.
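
The two pivoting steps of the example can be reproduced with a short routine. This is a sketch of a single integer pivoting step under the assumptions of the text: the tableau is stored as integer rows whose last entry is the right-hand side, and each new row is divided by the previous pivot element, a division that the cofactor argument above guarantees to be exact. The name `integer_pivot` is hypothetical.

```python
def integer_pivot(T, row, col, prev):
    """Pivot on T[row][col], keeping all entries integers.
    `prev` is the previous pivot element (1 before the first step).
    Returns the pivot element just used, to be passed as `prev` next time."""
    piv = T[row][col]
    for r in range(len(T)):
        if r != row:
            factor = T[r][col]
            # Multiply by the pivot, subtract a multiple of the pivot row,
            # then divide by the previous pivot (the division is exact).
            T[r] = [(piv * a - factor * b) // prev for a, b in zip(T[r], T[row])]
    return piv

# Columns: x1, x2, x3, s4, s5, right-hand side, as in system (3.11).
T = [[3, 2, 3, 1, 0, 1],
     [2, 6, 1, 0, 1, 1]]
prev = integer_pivot(T, row=1, col=1, prev=1)   # x2 enters, s5 leaves
after_first = [list(r) for r in T]
prev = integer_pivot(T, row=0, col=2, prev=prev)  # x3 enters, s4 leaves
```

After the first call the tableau equals system (3.13), with rows (14, 0, 16, 6, -2 | 4) and (2, 6, 1, 0, 1 | 1); after the second it equals (3.16), with rows (14, 0, 16, 6, -2 | 4) and (3, 16, 0, -1, 3 | 2).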


3.6 Degenerate Games 


The uniqueness of an LH path requires a nondegenerate game. In a degenerate game, a 
vertex of P, for example, may have more than m labels. When that vertex is represented 
as a basic feasible solution as in (3.17) this means that not only the m nonbasic variables 
are zero, but also at least one basic variable. Such a degenerate basic feasible solution 
results from a pivoting step where the leaving variable (representing the label that is 
picked up) is not unique. 

As an example, consider the 3 x 2 game 


$$A = \begin{pmatrix} 3 & 3 \\ 2 & 5 \\ 0 & 6 \end{pmatrix}, \qquad B = \begin{pmatrix} 3 & 3 \\ 2 & 6 \\ 3 & 1 \end{pmatrix}, \qquad (3.18)$$


which agrees with (3.3) except that $b_{15} = 3$. The polytope $Q$ for this game is the same
as before, shown on the right in Figure 3.2. The polytope P is the convex hull of the 
original vertices 0, a, c, d, e shown on the left in Figure 3.2, so vertex b has merged 
with a. The new facets of P with labels 4 and 5 are triangles with vertices a, d, e and 
a,c, d, respectively. 

In this example (3.18), the first step of the LH path for missing label 1 would be 
from (0, 0) to (a, 0), where the two labels 4 and 5 are picked up, because vertex a 
has the four labels 2,3, 4,5 due to the degeneracy. If then label 4 is dropped in Q, 
the algorithm finds the equilibrium (a, s) and no problem occurs. However, dropping 
label 5 in Q would mean a move to (a, p) where label 3 is picked up, and none of the 
two edges of P that move away from the facet with label 3 (which are the edges from
a to d and from a to e) would, together with p, be 1-almost completely labeled, so the
algorithm fails at this point. 

Degeneracy can be resolved by perturbing the linear system lexicographically, 
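which is well known from linear programming. Assume that the system $B^\top x + s = \mathbf{1}$,
say, is changed to the perturbed system $B^\top x + s = \mathbf{1} + (\varepsilon^1, \ldots, \varepsilon^n)^\top$. After any
number of pivoting steps, this system has the form

$$CB^\top x + Cs = C\mathbf{1} + C(\varepsilon^1, \ldots, \varepsilon^n)^\top \qquad (3.19)$$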


for some invertible matrix C. The corresponding unperturbed basic feasible solution 
may have a zero basic variable, which is a row of $C\mathbf{1}$, but for sufficiently small $\varepsilon > 0$ it
is positive if and only if in that row the first nonzero entry of the matrix $C$ is positive; this
is the invariant maintained by the algorithm, using a more general “lexico-minimum”
ratio test. No actual perturbation is required, and $C$ is already stored in the system as
the matrix of coefficients of s, as seen from (3.19). 

Degenerate games may have infinite sets of equilibria. In the example (3.18), vertex 
a of $P$, which represents the pure strategy $(1, 0, 0)^\top$ of player 1, together with the
entire edge that joins vertices r and s of $Q$, defines a component of Nash equilibria,
where player 2 plays some mixed strategy $(y_4, 1 - y_4)$ for $2/3 \le y_4 \le 1$. However, this
equilibrium component is a convex combination of the “extreme” equilibria (a, r) and 
(a, s). In general, even in a degenerate game, the Nash equilibria can be described in 
terms of pairs of vertices of P and Q. We write conv U for the convex hull of a set U. 


Proposition 3.8 Let $(A, B)$ be a bimatrix game, and $(x, y) \in P \times Q$. Then
$(x, y)$ (rescaled) is a Nash equilibrium if and only if there is a set $U$ of vertices of
$P - \{0\}$ and a set $V$ of vertices of $Q - \{0\}$ so that $x \in \operatorname{conv} U$ and $y \in \operatorname{conv} V$,
and every $(u, v) \in U \times V$ is completely labeled.


Proposition 3.8 holds because labels are preserved under convex combinations, and 
because every face of P or Q has the labels of its vertices, which are vertices of the 
entire polytope; for details see von Stengel (2002, Thm. 2.14). 

The following algorithm, which extends Algorithm 3.5, outputs a complete descrip- 
tion of all Nash equilibria of a bimatrix game: Define a bipartite graph on the vertices 
of P — {0} and Q — {0}, whose edges are the completely labeled vertex pairs (x, y). 
The “cliques” (maximal complete bipartite subgraphs) of this graph of the form U x V 
then define sets of Nash equilibria conv U x conv V whose union is the set of all Nash 
equilibria. These sets are called “maximal Nash subsets.” They may be nondisjoint, 
if they contain common points (x, y). The connected unions of these sets are usually 
called the (topological) components of Nash equilibria. 
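
The clique computation can be sketched by brute force for small examples. The code below is an illustration only: it assumes the completely labeled vertex pairs have already been found, takes them as a set of name pairs, and enumerates all subsets of P-vertices, which is exponential in general. The name `maximal_nash_subsets` is hypothetical. For the degenerate game (3.18), the completely labeled pairs are (a, r), (a, s), and (d, q), since vertex a has the four labels 2, 3, 4, 5.

```python
from itertools import combinations

def maximal_nash_subsets(pairs):
    """Maximal complete bipartite subgraphs U x V of the graph whose edges are
    the completely labeled vertex pairs (x, y)."""
    pairs = set(pairs)
    xs = sorted({x for x, _ in pairs})
    ys = sorted({y for _, y in pairs})
    result = set()
    for size in range(1, len(xs) + 1):
        for U in combinations(xs, size):
            # V: all y that are completely labeled with every x in U.
            V = frozenset(y for y in ys if all((x, y) in pairs for x in U))
            if not V:
                continue
            # Enlarge U to all x that are completely labeled with every y in V.
            U_closed = frozenset(x for x in xs if all((x, y) in pairs for y in V))
            result.add((U_closed, V))
    return result

subsets = maximal_nash_subsets({("a", "r"), ("a", "s"), ("d", "q")})
```

Here `subsets` contains the two maximal Nash subsets {a} x {r, s} and {d} x {q}, the first of which is the equilibrium component described above.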


3.7 Extensive Games and Their Strategic Form 


A game in strategic form is a “static” description of an interactive situation, where play- 
ers act simultaneously. A detailed “dynamic” description is an extensive game where 
players act sequentially, where some moves can be made by a chance player, and where 
each player’s information about earlier moves is modeled in detail. Extensive games are
a fundamental representation of dynamic interactions which generalizes other models
like repeated and multistage games, or games with incomplete information.


Figure 3.3. Left: A game in extensive form. Top right: Its strategic form payoff matrices A and B.
Bottom right: Its sequence form payoff matrices A and B, where rows and columns correspond
to the sequences of the players which are marked at the side. Any sequence pair not leading
to a leaf has matrix entry zero, which is left blank.


The basic structure of an extensive game is a directed tree. The nodes of the tree 
represent game states. Trees (rather than general graphs) are used because then a game 
state encodes the full history of play. Only one player moves at any one state along 
a tree edge. The game starts at the root (initial node) of the tree and ends at a leaf 
(terminal node), where each player receives a payoff. The nonterminal nodes are called 
decision nodes. A player’s possible moves are assigned to the outgoing edges of the 
decision node. 

The decision nodes are partitioned into information sets. All nodes in an information 
set belong to the same player, and have the same moves. The interpretation is that when 
a player makes a move, he only knows the information set but not the particular node 
he is at. In a game with perfect information, all information sets are singletons (and 
can therefore be omitted). We denote the set of information sets of player $i$ by $H_i$,
information sets by $h$, and the set of moves at $h$ by $C_h$.

Figure 3.3 shows an example of an extensive game. Moves are marked by upper-case 
letters for player 1 and by lowercase letters for player 2. Information sets are indicated 
by ovals. The two information sets of player 1 have move sets {L, R} and {S, T}, and 
the information set of player 2 has move set {l, r}. A play of the game may proceed
by player 1 choosing L, player 2 choosing r, and player 1 choosing S, after which the
game terminates with payoffs 5 and 6 to players 1 and 2. By definition, move S of
player 1 is the same, no matter whether player 2 has chosen l or r, because player 1
does not know the game state in his second information set. 

At some decision nodes, the next move may be a chance move. Chance is here 
treated as an additional player 0, who receives no payoff and who plays according to 
a known behavior strategy. A behavior strategy of player i is given by a probability 
distribution on $C_h$ for all $h$ in $H_i$. (The information sets belonging to the chance player
are singletons.) A pure strategy is a behavior strategy where each move is picked
deterministically. A pure strategy of player $i$ can be regarded as an element $(c_h)_{h \in H_i}$ of
$\prod_{h \in H_i} C_h$, that is, as a tuple of moves, like (L, S) for player 1 in Figure 3.3.

Tabulating all pure strategies of the players and recording the resulting expected 
payoffs defines the strategic form of the game. In Figure 3.3, the strategic form of the 
extensive game is shown at the top right, with payoff matrices A and B to player 1 and 
player 2. 
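
As a minimal sketch of this tabulation, the following Python code enumerates the pure strategies of both players in the game of Figure 3.3 and records the resulting payoffs. The figure is not reproduced in this text, so the leaf payoffs in `LEAVES` are an assumed reconstruction, chosen to agree with the values quoted in this chapter (payoffs 5 and 6 after L, r, S, and payoff 3 to player 2 after R). The names `LEAVES` and `play` are illustrative only.

```python
from itertools import product

# Assumed reconstruction of Figure 3.3: each leaf is keyed by the moves on its path.
# Player 1 moves L or R; after L, player 2 moves l or r; then player 1 moves S or T
# without knowing player 2's move. Values are (payoff to player 1, payoff to player 2).
LEAVES = {
    ("L", "l", "S"): (2, 2),
    ("L", "l", "T"): (0, 3),
    ("L", "r", "S"): (5, 6),
    ("L", "r", "T"): (6, 1),
    ("R",): (3, 3),
}


def play(strategy1, strategy2):
    """Follow two pure strategies through the tree and return the leaf payoffs."""
    first, second = strategy1  # moves at player 1's two information sets
    if first == "R":
        return LEAVES[("R",)]
    return LEAVES[("L", strategy2, second)]


pure1 = list(product("LR", "ST"))  # (L,S), (L,T), (R,S), (R,T)
pure2 = ["l", "r"]

# Strategic form payoff matrices: one row per pure strategy of player 1.
A = [[play(s1, s2)[0] for s2 in pure2] for s1 in pure1]
B = [[play(s1, s2)[1] for s2 in pure2] for s1 in pure1]

for s1, row_a, row_b in zip(pure1, A, B):
    print(s1, row_a, row_b)
```

Under these assumed payoffs the rows for (R, S) and (R, T) coincide, since player 1’s second information set is never reached after move R.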

Given the strategic form, a player can play according to a mixed strategy, which is 
a probability distribution on pure strategies. The player chooses a pure strategy, which 
is a complete plan of action, according to this distribution, and plays it in the game. 
In contrast, a behavior strategy can be played by “delaying” the random move until 
the player reaches the respective information set. It can be considered as a special 
mixed strategy since it defines a probability for every pure strategy, where the moves 
at information sets are chosen independently. 

We consider algorithms for finding Nash equilibria of an extensive game, with the 
tree together with the described game data as input. The strategic form is unsuitable for this
purpose because its size is typically exponential in the size of the game tree. As described in
the subsequent sections, this complexity can be reduced, in some cases by considering
subgames and corresponding subgame perfect equilibria. The reduced strategic form of
the game is smaller but may still be exponentially large. A reduction from exponential
to linear size is provided by the sequence form, which allows one to compute
behavior strategies directly rather than mixed strategies.


3.8 Subgame Perfect Equilibria 


A subgame of an extensive game is a subtree of the game tree that includes all information
sets containing a node of the subtree. Figure 3.3 has a subgame starting at the
decision node of player 2; the nodes in the second information set of player 1 are not
roots of subgames because player 1 does not know that he is in the respective subtree.
In the subgame, player 2 moves first, but player 1 does not get to know that move.
So this subgame is equivalent to a 2 × 2 game in strategic form where the players act
simultaneously. (In this way, every game in strategic form can be represented as a game
in extensive form.)

The subgame in Figure 3.3 has a unique mixed equilibrium with probability 2/3 for 
the moves T and r, respectively, and expected payoff 4 to player 1 and 8/3 to player 2. 
Replacing the subgame by the payoff pair (4, 8/3), one obtains a very simple game 
with moves L and R for player 1, where L is optimal. So player 1’s mixed strategy 
with probabilities 1/3 and 2/3 for (L, S) and (L, T) and player 2’s mixed strategy 
(1/3, 2/3) for l, r define a Nash equilibrium of the game. This is the, here unique,
subgame perfect equilibrium of the game, defined by the property that it induces a 
Nash equilibrium in every subgame. 
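
The mixed equilibrium of the subgame can be checked with the usual indifference conditions for a 2 × 2 game. The sketch below assumes the subgame payoffs reconstructed from the values quoted in the text (the figure itself is not shown here); `indifference_prob` is a hypothetical helper name.

```python
from fractions import Fraction

# Subgame after move L (assumed payoffs, consistent with the values in the text).
# Rows: player 1's moves S, T. Columns: player 2's moves l, r.
A = [[2, 5], [0, 6]]  # payoffs to player 1
B = [[2, 6], [3, 1]]  # payoffs to player 2


def indifference_prob(M):
    """Probability p on the first row that makes the column player indifferent,
    where M[row][col] is the column player's payoff. Solves
    p*M[0][0] + (1-p)*M[1][0] == p*M[0][1] + (1-p)*M[1][1]."""
    return Fraction(M[1][1] - M[1][0], M[0][0] - M[1][0] - M[0][1] + M[1][1])


p_S = indifference_prob(B)              # player 1's probability for S
A_T = [list(col) for col in zip(*A)]    # transpose, so that rows become l, r
q_l = indifference_prob(A_T)            # player 2's probability for l

# Expected payoffs in the mixed equilibrium.
payoff1 = q_l * A[0][0] + (1 - q_l) * A[0][1]
payoff2 = p_S * B[0][0] + (1 - p_S) * B[1][0]

print(p_S, q_l, payoff1, payoff2)
```

With these payoffs, S and l each receive probability 1/3, so T and r receive probability 2/3, and the expected payoffs are 4 and 8/3, in agreement with the text.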


Algorithm 3.9 (Subgame perfect equilibrium) Input: An extensive game.
Output: A subgame perfect Nash equilibrium of the game. Method: Consider,
in increasing order of inclusion, each subgame of the game, find a Nash equilibrium
of the subgame, and replace the subgame by a new terminal node that has
the equilibrium payoffs.

In a game with perfect information, every node is the root of a subgame. Then Algorithm 3.9
is the well-known, linear time backward induction method, also sometimes
known as “Zermelo’s algorithm.” Because the subgame involves only one player in
each iteration, a deterministic move is optimal, which shows that any game with perfect
information has a (subgame perfect) Nash equilibrium where every player uses a pure
strategy.
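
Backward induction can be sketched in a few lines of Python. The tree encoding, the function name `backward_induction`, and the example game below are all hypothetical and not taken from the text; ties are broken in favor of the first optimal move, which is one of several valid choices.

```python
# A node is either ("leaf", payoffs) or ("node", player, {move: child}).
# Players are indexed 0 and 1; payoffs is a tuple with one entry per player.
def backward_induction(node, path=()):
    """Return (payoffs, plan). The plan maps each decision node, identified
    by the tuple of moves leading to it, to the move chosen there."""
    if node[0] == "leaf":
        return node[1], {}
    _, player, children = node
    plan, best_move, best_payoffs = {}, None, None
    for move, child in children.items():
        payoffs, sub_plan = backward_induction(child, path + (move,))
        plan.update(sub_plan)
        # Keep the first move that maximizes the moving player's payoff.
        if best_payoffs is None or payoffs[player] > best_payoffs[player]:
            best_move, best_payoffs = move, payoffs
    plan[path] = best_move
    return best_payoffs, plan


# Hypothetical perfect-information game with two levels.
tree = ("node", 0, {
    "a": ("node", 1, {"c": ("leaf", (3, 1)), "d": ("leaf", (0, 0))}),
    "b": ("node", 1, {"e": ("leaf", (1, 2)), "f": ("leaf", (2, 3))}),
})

payoffs, plan = backward_induction(tree)
print(payoffs, plan)
```

Each node is visited once, which reflects the linear running time. The returned plan specifies a move at every decision node, including those off the equilibrium path, as subgame perfection requires.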

In games with imperfect information, a subgame perfect equilibrium may require 
mixed strategies, as Figure 3.3 demonstrates. 


3.9 Reduced Strategic Form 


Not all extensive games have nontrivial subgames, and one may also be interested 
in equilibria that are not subgame perfect. In Figure 3.3, such an equilibrium is the 
pure strategy pair ((R, S), l). Here, player 2 is indifferent between her moves l and r
because the initial move R of player 1 means that player 2 never has to make move l
or r, so player 2 receives the constant payoff 3 after move R. If play actually reached
player 2’s information set, move l would not be optimal against S, which is why this is
not a subgame perfect equilibrium. Player 2 can, in fact, randomize between l and r,
and as long as l is played with probability at least 2/3, (R, S) remains a best response
of player 1, as required in equilibrium.

In this game, the pure strategies (R, S) and (R, T) of player 1 are overspecific
as “plans of action”: the initial move R of player 1 makes the subsequent choice 
of S or T irrelevant since player 1’s second information set cannot be reached after 
move R. Consequently, the two payoff rows for (R, S) and (R, T) are identical for both 
players. In the reduced strategic form, moves at information sets that cannot be reached 
because of an earlier own move are identified. In Figure 3.3, this reduction yields the 
pure strategy (more precisely, equivalence class of pure strategies) (R, *), where * 
denotes an arbitrary move. The two (reduced as well as unreduced) pure strategies of 
player 2 are her moves l and r.

The reduced strategic form of Figure 3.3 corresponds to the bimatrix game (3.18) if 
(R, *) is taken as the first strategy (top row) of player 1. This game is degenerate even 
if the payoffs in the extensive game are generic, because player 2, irrespective of her 
own move, receives constant payoff 3 when player 1 chooses (R, *). 

Once a two-player extensive game has been converted to its reduced strategic form, 
it can be considered as a bimatrix game, where we refer to its rows and columns as the 
“pure strategies” of player 1 and 2, even if they leave moves at unreachable information 
sets unspecified. 
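
The identification of strategies in the reduced strategic form can be sketched as follows. The encoding of information sets as pairs (own sequence leading to the set, moves) and the function name `reduced_strategies` are assumptions made for this illustration; the sketch also assumes that move names are distinct across information sets.

```python
from itertools import product

# Player 1's information sets in Figure 3.3:
# name -> (player 1's own moves leading to the set, moves available there).
INFO_SETS_1 = {
    "h1": ((), ("L", "R")),
    "h2": (("L",), ("S", "T")),
}


def reduced_strategies(info_sets):
    """Enumerate pure strategies, writing '*' at information sets that the
    player's own earlier moves make unreachable, and merging duplicates."""
    names = list(info_sets)
    seen = []
    for choice in product(*(info_sets[h][1] for h in names)):
        moves = dict(zip(names, choice))
        chosen = set(choice)
        reduced = tuple(
            # h is reachable only if the strategy plays every own move leading to it.
            moves[h] if all(c in chosen for c in info_sets[h][0]) else "*"
            for h in names
        )
        if reduced not in seen:
            seen.append(reduced)
    return seen


print(reduced_strategies(INFO_SETS_1))
```

For Figure 3.3 this merges (R, S) and (R, T) into (R, *), leaving three reduced strategies for player 1. The sketch still loops over all unreduced strategies, so it illustrates the definition rather than an efficient enumeration.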

The concept of subgame perfect equilibrium requires fully specified strategies, 
rather than reduced strategies. For example, it is not possible to say whether the Nash 
equilibrium ((R, *), l) of the reduced strategic form of the game in Figure 3.3 is
subgame perfect or not, because player 1’s behavior at his second information set is 
unspecified. This could be said for a Nash equilibrium of the full strategic form with 
two rows (R, S) and (R, T). However, these identical two rows are indistinguishable 
computationally, so there is no point in applying an algorithm to the full rather than the 
reduced strategic form, because any splitting of probabilities between payoff-identical
strategies would be arbitrary. If one is interested in finding subgame perfect equilibria,
one should use Algorithm 3.9. At each stage of that algorithm, the considered games 
have by definition no further subgames, and equilibria of these games can be found 
using the reduced strategic form or the sequence form. 

A player may have parallel information sets that are not distinguished by his own
earlier moves. These arise when a player receives information about an earlier move by
another player. Combinations of moves at parallel information sets cannot be reduced,
which causes a multiplicative growth of the number of reduced strategies. In general,
the reduced strategic form can therefore still be exponential in the size of the game tree.


3.10 The Sequence Form 


In the reduced strategic form, pure strategies are only partially specified, by omitting 
moves at information sets that cannot be reached because of an own earlier move. In 
the sequence form, pure strategies are replaced by an even more partial description 
of sequences which specify a player’s moves only along a path in the game tree. The 
number of these paths, and therefore of these sequences, is bounded by the number 
of nodes of the tree. However, randomizing between such sequences can no longer be 
described by a single probability distribution, but requires a system of linear equations. 

A sequence of moves of player $i$ is the sequence of his moves (disregarding the
moves of other players) on the unique path from the root to some node $t$ of the tree, and
is denoted $\sigma_i(t)$. For example, for the leftmost leaf $t$ in Figure 3.3 this sequence is LS
for player 1 and l for player 2. The empty sequence is denoted $\emptyset$. Player $i$ has perfect
recall if and only if $\sigma_i(s) = \sigma_i(t)$ for any nodes $s, t \in h$ and $h \in H_i$. Then the unique
sequence $\sigma_i(t)$ leading to any node $t$ in $h$ will be denoted $\sigma_h$. Perfect recall means that
the player cannot get additional information about his position in an information set
by remembering his earlier moves. We assume all players have perfect recall.

Let $\beta_i$ be a behavior strategy of player $i$. The move probabilities $\beta_i(c)$ fulfill

$$\sum_{c \in C_h} \beta_i(c) = 1, \qquad \beta_i(c) \ge 0 \quad \text{for } h \in H_i,\ c \in C_h. \tag{3.20}$$

The realization probability of a sequence $\sigma$ of player $i$ under $\beta_i$ is

$$\beta_i[\sigma] = \prod_{c \text{ in } \sigma} \beta_i(c). \tag{3.21}$$

An information set $h$ in $H_i$ is called relevant under $\beta_i$ if $\beta_i[\sigma_h] > 0$, otherwise irrelevant,
in agreement with irrelevant information sets as considered in the reduced strategic
form.
Let $S_i$ be the set of sequences of moves for player $i$. Then any $\sigma$ in $S_i$ is either the
empty sequence or uniquely given by its last move $c$ at the information set $h$ in $H_i$,
that is, $\sigma = \sigma_h c$. Hence,

$$S_i = \{\emptyset\} \cup \{\sigma_h c \mid h \in H_i,\ c \in C_h\}.$$

This implies that the number of sequences of player $i$, apart from the empty sequence,
is equal to his total number of moves, that is, $|S_i| = 1 + \sum_{h \in H_i} |C_h|$. This number is
linear in the size of the game tree.

Let $\beta_1$ and $\beta_2$ denote behavior strategies of the two players, and let $\beta_0$ be the known
behavior of the chance player. Let $a(t)$ and $b(t)$ denote the payoffs to player 1 and
player 2, respectively, at a leaf $t$ of the tree. The probability of reaching $t$ is the product
of move probabilities on the path to $t$. The expected payoff to player 1 is therefore

$$\sum_{\text{leaves } t} a(t)\, \beta_0[\sigma_0(t)]\, \beta_1[\sigma_1(t)]\, \beta_2[\sigma_2(t)], \tag{3.22}$$

and the expected payoff to player 2 is the same expression with $b(t)$ instead of $a(t)$.
However, the expected payoff is nonlinear in terms of behavior strategy probabilities
$\beta_i(c)$ since the terms $\beta_i[\sigma_i(t)]$ are products by (3.21).
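
The definitions so far can be traced on the game of Figure 3.3 with a short Python sketch. It lists the sequences of each player, confirms the count of sequences, and evaluates the sum over leaves in (3.22). The leaf payoffs are an assumed reconstruction consistent with the values quoted in the text, and the names `sequences`, `realization`, and `expected_payoffs` are illustrative.

```python
from fractions import Fraction
from math import prod

# Information sets as pairs (sequence sigma_h leading to h, moves C_h).
H1 = [((), ("L", "R")), (("L",), ("S", "T"))]
H2 = [((), ("l", "r"))]


def sequences(info_sets):
    """S_i: the empty sequence together with all sigma_h c for h in H_i, c in C_h."""
    return [()] + [sigma_h + (c,) for sigma_h, moves in info_sets for c in moves]


S1, S2 = sequences(H1), sequences(H2)

# Leaves as (sigma_1(t), sigma_2(t), a(t), b(t)). Payoffs are assumed.
# There are no chance moves, so the chance factor in (3.22) is 1 throughout.
LEAVES = [
    (("L", "S"), ("l",), 2, 2),
    (("L", "T"), ("l",), 0, 3),
    (("L", "S"), ("r",), 5, 6),
    (("L", "T"), ("r",), 6, 1),
    (("R",), (), 3, 3),
]


def realization(beta, sigma):
    """Realization probability of sigma: product of move probabilities, as in (3.21)."""
    return prod((beta[c] for c in sigma), start=Fraction(1))


def expected_payoffs(beta1, beta2):
    """Weighted sum over all leaves as in (3.22), for both players."""
    pay1 = sum(a * realization(beta1, s1) * realization(beta2, s2)
               for s1, s2, a, _ in LEAVES)
    pay2 = sum(b * realization(beta1, s1) * realization(beta2, s2)
               for s1, s2, _, b in LEAVES)
    return pay1, pay2


third = Fraction(1, 3)
beta1 = {"L": Fraction(1), "R": Fraction(0), "S": third, "T": 2 * third}
beta2 = {"l": third, "r": 2 * third}

print(S1, S2)
print(expected_payoffs(beta1, beta2))
```

For the behavior strategies of the subgame perfect equilibrium this gives the payoffs 4 and 8/3. Each summand multiplies several move probabilities of the same player, which is the nonlinearity noted above.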

Therefore, we consider directly the realization probabilities $\beta_i[\sigma]$ as functions of
sequences $\sigma$ in $S_i$. They can also be defined for mixed strategies $\mu_i$ of player $i$,
which choose each pure strategy $\pi_i$ of player $i$ with probability $\mu_i(\pi_i)$. Under $\pi_i$, the
realization probability of $\sigma$ in $S_i$ is $\pi_i[\sigma]$, which is equal to 1 if $\pi_i$ prescribes all the
moves in $\sigma$ and zero otherwise. Under $\mu_i$, the realization probability of $\sigma$ is

$$\mu_i[\sigma] = \sum_{\pi_i} \mu_i(\pi_i)\, \pi_i[\sigma]. \tag{3.23}$$

For player 1, this defines a map $x$ from $S_1$ to $\mathbb{R}$ by $x(\sigma) = \mu_1[\sigma]$ for $\sigma \in S_1$. We call
$x$ the realization plan of $\mu_1$ or a realization plan for player 1. A realization plan for
player 2, similarly defined on $S_2$ by a mixed strategy $\mu_2$, is denoted $y$. Realization
plans have two important properties.


Proposition 3.10 A realization plan $x$ of a mixed strategy of player 1 fulfills
$x(\sigma) \ge 0$ for all $\sigma \in S_1$ and

$$x(\emptyset) = 1, \qquad \sum_{c \in C_h} x(\sigma_h c) = x(\sigma_h) \quad \text{for all } h \in H_1. \tag{3.24}$$

Conversely, any $x \colon S_1 \to \mathbb{R}$ with these properties is the realization plan of a
behavior strategy of player 1, which is unique except at irrelevant information
sets. A realization plan $y$ of player 2 is characterized analogously.
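
Both directions of Proposition 3.10 can be illustrated for player 1 in Figure 3.3. The sketch below computes a realization plan from a mixed strategy via (3.23), checks (3.24), and recovers a behavior strategy by dividing $x(\sigma_h c)$ by $x(\sigma_h)$. The mixed strategy is a made-up example, and the uniform choice at irrelevant information sets is an arbitrary convention of this sketch.

```python
from fractions import Fraction

H1 = [((), ("L", "R")), (("L",), ("S", "T"))]  # (sigma_h, C_h) for player 1
S1 = [()] + [s + (c,) for s, moves in H1 for c in moves]


def realization_plan(mixed):
    """Equation (3.23). `mixed` maps pure strategies (one move per information
    set) to probabilities; a pure strategy realizes sigma if it contains all of
    sigma's moves (move names are assumed distinct across information sets)."""
    return {
        sigma: sum((p for pi, p in mixed.items() if set(sigma) <= set(pi)),
                   Fraction(0))
        for sigma in S1
    }


def satisfies_3_24(x):
    """Nonnegativity together with the equations (3.24)."""
    return (x[()] == 1
            and all(v >= 0 for v in x.values())
            and all(sum(x[s + (c,)] for c in moves) == x[s] for s, moves in H1))


def behavior_strategy(x):
    """beta(c) = x(sigma_h c) / x(sigma_h) at relevant information sets;
    at irrelevant ones any distribution works (uniform is used here)."""
    beta = {}
    for s, moves in H1:
        for c in moves:
            beta[c] = x[s + (c,)] / x[s] if x[s] > 0 else Fraction(1, len(moves))
    return beta


# Hypothetical mixed strategy of player 1.
mixed = {("L", "S"): Fraction(1, 6), ("L", "T"): Fraction(1, 3),
         ("R", "S"): Fraction(1, 2), ("R", "T"): Fraction(0)}

x = realization_plan(mixed)
beta = behavior_strategy(x)
print(x)
print(beta)
```

The realization plan has five entries, one per sequence, whereas the mixed strategy needs one probability for each of the four pure strategies. In larger games this gap grows exponentially.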


For the second property, two mixed strategies are called realization equivalent if 
they reach any node of the tree with the same probabilities, given any strategy of the 
other player. We can assume that all chance probabilities $\beta_0(c)$ are positive, by pruning
any tree branches that are unreached by chance.


Proposition 3.11. Two mixed strategies $\mu_i$ and $\mu_i'$ of player $i$ are realization
equivalent if and only if they have the same realization plan, that is, $\mu_i[\sigma] = \mu_i'[\sigma]$
for all $\sigma \in S_i$.


These two propositions (to be proved in Exercise 3.13) imply the well-known 
result by Kuhn (1953) that behavior strategies are strategically as expressive as mixed 
strategies. 


Corollary 3.12 (Kuhn’s theorem) For a player with perfect recall, any mixed
strategy is realization equivalent to a behavior strategy.

Proposition 3.10 characterizes realization plans by nonnegativity and the equations
(3.24). A realization plan describes a behavior strategy uniquely except for the moves
at irrelevant information sets. In particular, the realization plan of a pure strategy (that
is, a realization plan with values 0 or 1) is as specific as a reduced pure strategy.

A realization plan represents all the relevant strategic information of a mixed strategy
by Proposition 3.11. This compact information is obtained with the linear map in (3.23).
This map assigns to any mixed strategy $\mu_i$, regarded as a tuple of mixed strategy
probabilities $\mu_i(\pi_i)$, its realization plan, regarded as a tuple of realization probabilities
$\mu_i[\sigma]$ for $\sigma$ in $S_i$. The simplex of mixed strategies is thereby mapped to the polytope of
realization plans defined by the linear constraints in Proposition 3.10. The vertices of
this polytope are the realization plans of pure strategies. The number of these vertices
may be exponential. However, the number of defining inequalities and the dimension
of the polytope are linear in the tree size. For player $i$, this dimension is the number
$|S_i|$ of variables minus the number $1 + |H_i|$ of equations (3.24) (which are linearly
independent), so it is $\sum_{h \in H_i} (|C_h| - 1)$.

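We consider realization plans as vectors $x \in \mathbb{R}^{|S_1|}$ and $y \in \mathbb{R}^{|S_2|}$, that is,
$x = (x_\sigma)_{\sigma \in S_1}$ where $x_\sigma = x(\sigma)$, and similarly $y = (y_\tau)_{\tau \in S_2}$. The linear constraints in
Proposition 3.10 are denoted by

$$Ex = e, \quad x \ge 0 \qquad \text{and} \qquad Fy = f, \quad y \ge 0, \tag{3.25}$$

using the constraint matrices $E$ and $F$ and vectors $e$ and $f$. The matrix $E$ and right-hand
side $e$ have $1 + |H_1|$ rows, and $E$ has $|S_1|$ columns. The first row denotes the
equation $x(\emptyset) = 1$ in (3.24). The other rows for $h \in H_1$ are the equations
$-x(\sigma_h) + \sum_{c \in C_h} x(\sigma_h c) = 0$.

In Figure 3.3, the sets of sequences are $S_1 = \{\emptyset, L, R, LS, LT\}$ and $S_2 = \{\emptyset, l, r\}$,
and in (3.25),

$$E = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ -1 & 1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 1 & 1 \end{pmatrix}, \quad e = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad F = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 1 \end{pmatrix}, \quad f = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

Each sequence appears exactly once on the left-hand side of the equations (3.24),
accounting for the entry 1 in each column of $E$ and $F$. The number of information sets
and therefore the number of rows of $E$ and $F$ is at most linear in the size of the game
tree.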
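
The constraint matrices can be generated mechanically from the information sets. The following sketch assumes each information set is given as a pair (sequence leading to it, moves), and the helper name `constraint_matrix` is hypothetical. It rebuilds $E$, $e$, $F$, $f$ for Figure 3.3 and checks $Ex = e$ for one realization plan.

```python
from fractions import Fraction


def constraint_matrix(info_sets):
    """Build the constraint matrix and right-hand side of (3.25) from a list
    of (sigma_h, C_h) pairs. Columns follow the order of the returned sequences."""
    seqs = [()] + [s + (c,) for s, moves in info_sets for c in moves]
    col = {sigma: j for j, sigma in enumerate(seqs)}
    rows = [[0] * len(seqs)]
    rows[0][col[()]] = 1                  # first row: x(empty sequence) = 1
    for s, moves in info_sets:
        row = [0] * len(seqs)
        row[col[s]] = -1                  # -x(sigma_h)
        for c in moves:
            row[col[s + (c,)]] = 1        # + x(sigma_h c) for each move c
        rows.append(row)
    rhs = [1] + [0] * len(info_sets)
    return seqs, rows, rhs


H1 = [((), ("L", "R")), (("L",), ("S", "T"))]
H2 = [((), ("l", "r"))]
S1, E, e = constraint_matrix(H1)
S2, F, f = constraint_matrix(H2)

# Realization plan of player 1 that plays L and then S with probability 1/3.
x = [Fraction(1), Fraction(1), Fraction(0), Fraction(1, 3), Fraction(2, 3)]
Ex = [sum(a * b for a, b in zip(row, x)) for row in E]

print(E, e)
print(F, f)
print(Ex == e)
```

Each row has one entry per move at its information set plus one entry −1, so the number of nonzero entries is linear in the number of sequences.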

Define the sequence form payoff matrices $A$ and $B$, each of dimension $|S_1| \times |S_2|$,
as follows. For $\sigma \in S_1$ and $\tau \in S_2$, let the matrix entry $a_{\sigma\tau}$ of $A$ be defined by

$$a_{\sigma\tau} = \sum_{\text{leaves } t \,:\, \sigma_1(t) = \sigma,\ \sigma_2(t) = \tau} a(t)\, \beta_0[\sigma_0(t)]. \tag{3.26}$$

The matrix entry of $B$ is this term with $b$ instead of $a$. An example is shown on the
bottom right in Figure 3.3. These two matrices are sparse, since the matrix entry for a
pair $\sigma, \tau$ of sequences is zero (the empty sum) whenever these sequences do not lead
to a leaf. If they do, the matrix entry is the payoff at the leaf (or leaves, weighted with
chance probabilities of reaching the leaves, if there are chance moves). Then by (3.22),
the expected payoffs to players 1 and 2 are $x^\top A y$ and $x^\top B y$, respectively, which is
just another way of writing the weighted sum over all leaves. The constraint and payoff
matrices define the sequence form of the game.
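
Because the payoff matrices are sparse, a dictionary keyed by sequence pairs is a natural representation. The sketch below builds $A$ and $B$ from a list of leaves according to (3.26) and evaluates $x^\top A y$ and $x^\top B y$. The leaf payoffs are an assumed reconstruction of Figure 3.3, and the function names are illustrative.

```python
from fractions import Fraction

# Leaves as (sigma_1(t), sigma_2(t), a(t), b(t), chance probability of t).
# Payoffs are assumed; the game has no chance moves, so the last entry is 1.
LEAVES = [
    (("L", "S"), ("l",), 2, 2, 1),
    (("L", "T"), ("l",), 0, 3, 1),
    (("L", "S"), ("r",), 5, 6, 1),
    (("L", "T"), ("r",), 6, 1, 1),
    (("R",), (), 3, 3, 1),
]


def sequence_form_payoffs(leaves):
    """Sparse matrices A and B of (3.26): only sequence pairs that lead to a
    leaf get an entry, and leaves sharing a pair are summed."""
    A, B = {}, {}
    for s1, s2, a, b, chance in leaves:
        A[s1, s2] = A.get((s1, s2), 0) + a * chance
        B[s1, s2] = B.get((s1, s2), 0) + b * chance
    return A, B


def bilinear(x, M, y):
    """x^T M y for a sparse matrix M and realization plans given as dicts."""
    return sum(x[s] * value * y[t] for (s, t), value in M.items())


A, B = sequence_form_payoffs(LEAVES)

third = Fraction(1, 3)
x = {(): Fraction(1), ("L",): Fraction(1), ("R",): Fraction(0),
     ("L", "S"): third, ("L", "T"): 2 * third}
y = {(): Fraction(1), ("l",): third, ("r",): 2 * third}

print(bilinear(x, A, y), bilinear(x, B, y))
```

The expected payoff is now bilinear in $x$ and $y$: each leaf contributes one product of a single entry of $x$ with a single entry of $y$.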


3.11 Computing Equilibria with the Sequence Form 


Realization plans in the sequence form take the role of mixed strategies in the strategic
form. In fact, mixed strategies $x$ and $y$ are a special case, by letting $E$ and $F$ in (3.25)
be single rows $\mathbf{1}^\top$ and $e = f = 1$. The computation of equilibria with the sequence
form uses linear programming duality, which is also of interest for the strategic form.

Consider a fixed realization plan $y$ of player 2. A best response $x$ of player 1 is a
realization plan that maximizes his expected payoff $x^\top (Ay)$. That is, $x$ is a solution to
the linear program (LP)

$$\underset{x}{\text{maximize}} \ x^\top (Ay) \quad \text{subject to} \quad Ex = e, \quad x \ge 0. \tag{3.27}$$

This LP has a dual LP with a vector $u$ of unconstrained variables whose dimension is
$1 + |H_1|$, the number of rows of $E$. This dual LP states

$$\underset{u}{\text{minimize}} \ e^\top u \quad \text{subject to} \quad E^\top u \ge Ay. \tag{3.28}$$

Both LPs have feasible solutions, so by the strong duality theorem of linear programming,
they have the same optimal value.
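
For a fixed $y$, the pair of LPs (3.27) and (3.28) can be solved without a general LP solver. One way, sketched below, is a bottom-up pass over player 1’s information sets in which each dual variable is the best payoff obtainable from that information set onward. This recursion is a swapped-in technique for illustration rather than a method stated in the text; the payoffs are the assumed reconstruction of Figure 3.3, and all function names are hypothetical.

```python
from fractions import Fraction

H1 = [((), ("L", "R")), (("L",), ("S", "T"))]  # (sigma_h, C_h), parents first
A = {(("R",), ()): 3, (("L", "S"), ("l",)): 2, (("L", "S"), ("r",)): 5,
     (("L", "T"), ("l",)): 0, (("L", "T"), ("r",)): 6}  # assumed payoffs


def best_response(y):
    """Return (x, u, Ay): a pure best response x solving (3.27), a dual
    solution u of (3.28) with u['root'] for the first row, and the vector Ay."""
    Ay = {}
    for (sigma, tau), a in A.items():
        Ay[sigma] = Ay.get(sigma, 0) + a * y[tau]
    u, choice = {}, {}

    def value(sigma):
        # Payoff collected at sigma plus the values of the information sets
        # that player 1 enters directly after playing sigma.
        return Ay.get(sigma, 0) + sum(
            solve(h) for h, (sigma_h, _) in enumerate(H1) if sigma_h == sigma)

    def solve(h):
        sigma_h, moves = H1[h]
        choice[h] = max(moves, key=lambda c: value(sigma_h + (c,)))
        u[h] = value(sigma_h + (choice[h],))
        return u[h]

    u["root"] = value(())
    x = {(): Fraction(1)}
    for h, (sigma_h, moves) in enumerate(H1):
        for c in moves:
            x[sigma_h + (c,)] = x[sigma_h] if c == choice[h] else Fraction(0)
    return x, u, Ay


def dual_feasible(u, Ay):
    """Check the constraints E^T u >= Ay of (3.28), one column per sequence."""
    def entered(sigma):
        return sum(u[h] for h, (s, _) in enumerate(H1) if s == sigma)
    ok = u["root"] - entered(()) >= Ay.get((), 0)
    for h, (sigma_h, moves) in enumerate(H1):
        for c in moves:
            sigma = sigma_h + (c,)
            ok = ok and u[h] - entered(sigma) >= Ay.get(sigma, 0)
    return ok


y = {(): Fraction(1), ("l",): Fraction(1, 3), ("r",): Fraction(2, 3)}
x, u, Ay = best_response(y)
primal_value = sum(x[s] * Ay.get(s, 0) for s in x)
print(primal_value, u["root"], dual_feasible(u, Ay))
```

The primal value $x^\top (Ay)$ and the dual value $e^\top u$ (here the entry `u["root"]`) agree, as strong duality requires. The sketch recomputes subproblems and is meant for small trees only.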

Consider now a zero-sum game, where $B = -A$. Player 2, when choosing $y$, has
to assume that her opponent plays rationally and maximizes $x^\top A y$. This maximum
payoff to player 1 is the optimal value of the LP (3.27), which is equal to the optimal
value $e^\top u$ of the dual LP (3.28). Player 2 is interested in minimizing $e^\top u$ by her choice
of $y$. The constraints of (3.28) are linear in $u$ and $y$ even if $y$ is treated as a variable.
So a minmax realization plan $y$ of player 2 (minimizing the maximum amount she has
to pay) is a solution to the LP

$$\underset{u,\,y}{\text{minimize}} \ e^\top u \quad \text{subject to} \quad Fy = f, \quad E^\top u - Ay \ge 0, \quad y \ge 0. \tag{3.29}$$

The dual of this LP has variables $v$ and $x$ corresponding to the primal constraints
$Fy = f$ and $E^\top u - Ay \ge 0$, respectively. It has the form

$$\underset{v,\,x}{\text{maximize}} \ f^\top v \quad \text{subject to} \quad Ex = e, \quad F^\top v - A^\top x \le 0, \quad x \ge 0. \tag{3.30}$$

It is easy to verify that this LP describes the problem of finding a maxmin realization
plan $x$ (with maxmin payoff $f^\top v$) for player 1.

This implies, first, that any zero-sum game has an equilibrium $(x, y)$. More importantly,
given an extensive game, the number of nonzero entries in the sparse matrices
$E$, $F$, $A$, and the number of variables, is linear in the size of the game tree. Hence, we
have shown the following.


Theorem 3.13 The equilibria of a two-person zero-sum game in extensive form
with perfect recall are the solutions to the LP (3.29) with sparse payoff matrix $A$
in (3.26) and constraint matrices $E$ and $F$ in (3.25) defined by Prop. 3.10. The
size of this LP is linear in the size of the game tree.

A best response $x$ of player 1 against the mixed strategy $y$ of player 2 is a solution
to the LP (3.27). This is also useful for games that are not zero-sum. By strong duality,
a feasible solution $x$ is optimal if and only if there is a dual solution $u$ fulfilling
$E^\top u \ge Ay$ and $x^\top (Ay) = e^\top u$, that is, $x^\top (Ay) = (x^\top E^\top) u$ or equivalently

$$x^\top (E^\top u - Ay) = 0. \tag{3.31}$$

Because the vectors $x$ and $E^\top u - Ay$ are nonnegative, (3.31) states that they are
complementary in the sense that they cannot both have positive components in the same
position. This characterization of an optimal primal-dual pair of feasible solutions is
known as complementary slackness in linear programming. For the strategic form, this
condition is equivalent to the best response condition (3.2).

For player 2, the realization plan $y$ is a best response to $x$ if and only if it maximizes
$(x^\top B) y$ subject to $Fy = f$, $y \ge 0$. The dual of this LP has the vector $v$ of variables and
says: minimize $f^\top v$ subject to $F^\top v \ge B^\top x$. Here, a primal-dual pair $y, v$ of feasible
solutions is optimal if and only if, analogous to (3.31),

$$y^\top (F^\top v - B^\top x) = 0. \tag{3.32}$$


Considering these conditions for both players, this shows the following. 


Theorem 3.14 Consider the two-person extensive game with sequence form
payoff matrices $A$, $B$ and constraint matrices $E$, $F$. Then the pair $(x, y)$ of
realization plans defines an equilibrium if and only if there are vectors $u$, $v$ so
that

$$\begin{aligned} Ex = e, \quad x \ge 0, \quad & Fy = f, \quad y \ge 0, \\ E^\top u - Ay \ge 0, \quad & F^\top v - B^\top x \ge 0 \end{aligned} \tag{3.33}$$

and (3.31), (3.32) hold. The size of the matrices $E$, $F$, $A$, $B$ is linear in the size
of the game tree.
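
The conditions of Theorem 3.14 are easy to check for given candidate vectors. The following sketch does so for the game of Figure 3.3, using dense matrices with rows and columns ordered as $S_1 = \{\emptyset, L, R, LS, LT\}$ and $S_2 = \{\emptyset, l, r\}$. The payoff entries are an assumed reconstruction of the figure, the dual vectors were worked out by hand for this illustration, and `is_equilibrium` is a hypothetical helper name.

```python
from fractions import Fraction as Fr

E = [[1, 0, 0, 0, 0], [-1, 1, 1, 0, 0], [0, -1, 0, 1, 1]]
e = [1, 0, 0]
F = [[1, 0, 0], [-1, 1, 1]]
f = [1, 0]
# Sequence form payoff matrices (assumed payoffs), rows S1, columns S2.
A = [[0, 0, 0], [0, 0, 0], [3, 0, 0], [0, 2, 5], [0, 0, 6]]
B = [[0, 0, 0], [0, 0, 0], [3, 0, 0], [0, 2, 6], [0, 3, 1]]


def matvec(M, v):
    return [sum(m * w for m, w in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def is_equilibrium(x, y, u, v):
    """Check (3.33) together with the complementarity conditions (3.31), (3.32)."""
    r = [p - q for p, q in zip(matvec(transpose(E), u), matvec(A, y))]
    s = [p - q for p, q in zip(matvec(transpose(F), v), matvec(transpose(B), x))]
    feasible = (matvec(E, x) == e and matvec(F, y) == f
                and min(x) >= 0 and min(y) >= 0
                and min(r) >= 0 and min(s) >= 0)
    complementary = (sum(a * b for a, b in zip(x, r)) == 0
                     and sum(a * b for a, b in zip(y, s)) == 0)
    return feasible and complementary


# Subgame perfect equilibrium: L, then S with probability 1/3; l with probability 1/3.
spe = is_equilibrium([1, 1, 0, Fr(1, 3), Fr(2, 3)], [1, Fr(1, 3), Fr(2, 3)],
                     [4, 4, 4], [Fr(8, 3), Fr(8, 3)])
# Equilibrium ((R, *), l), which is not subgame perfect.
other = is_equilibrium([1, 0, 1, 0, 0], [1, 1, 0], [3, 3, 2], [3, 0])
# (L, S) against l fails complementarity with the same dual vectors.
bad = is_equilibrium([1, 1, 0, 1, 0], [1, 1, 0], [3, 3, 2], [3, 0])

print(spe, other, bad)
```

Note that the check only certifies a candidate for which suitable $u$ and $v$ are supplied. A `False` result for one choice of dual vectors does not by itself rule out an equilibrium, since other dual vectors might work.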


The conditions (3.33), together with (3.31) and (3.32), define a linear complementarity
problem (LCP). For a game in strategic form, (3.8), (3.9), and (3.10) also define an LCP,
to which the LH algorithm finds one solution. For a general extensive game, the LH
algorithm cannot be applied to the LCP in Theorem 3.14, because $u$ and $v$ are not scalar
dual variables that are easily eliminated from the system. Instead, it is possible to use
a variant called Lemke’s algorithm. Similar to the LH algorithm, it introduces a degree of
freedom to the system, by considering an additional column for the linear equations and a
corresponding variable $z_0$ which is initially nonzero, and which allows for an initial
feasible solution where $x = 0$ and $y = 0$. Then a binding inequality in $r = E^\top u - Ay \ge 0$
(or $s = F^\top v - B^\top x \ge 0$) means that a basic slack variable $r_\sigma$ (or $s_\tau$) can leave
the basis, with $x_\sigma$ (respectively, $y_\tau$) entering, while keeping (3.10). Like in the LH
algorithm, this “complementary pivoting rule” continues until an equilibrium is found,
here when the auxiliary variable $z_0$ leaves the basis.


3.12 Further Reading


A scholarly and more comprehensive account of the results of this chapter is von 
Stengel (2002). The best response condition (Proposition 3.1) is due to Nash (1951). 
Algorithm 3.4 is folklore, and has been used by Dickhaut and Kaplan (1991). Polyhedra 
are explained in Ziegler (1995). Shapley (1974) introduced distinct labels as in (3.1) 
to visualize the LH algorithm. He labels subdivisions of the mixed strategy simplices, 
ignoring the payoff components in P and Q in (3.4). We prefer the polytope view using 
P and Q in (3.6), which simplifies the LH algorithm. Moreover, this view is useful for 
constructing games with many equilibria (von Stengel, 1999) that come close to the 
upper bound theorem for polytopes (Keiding, 1997; McMullen, 1970), and for games
with exponentially long LH paths (Savani and von Stengel, 2006).

Algorithm 3.5 is suggested in (Kuhn, 1961; Mangasarian, 1964; Vorob’ev, 1958).
The lrs method for vertex enumeration is due to (Avis, 2005; Avis and Fukuda, 1992).
An equilibrium enumeration that (implicitly) alternates between P and Q is Audet
et al. (2001). It has been implemented with integer pivoting (like lrs) by Rosenberg
(2004). 

The LH algorithm is due to Lemke and Howson (1964). Shapley (1974) also shows 
that the endpoints of an LH path are equilibria of different index, which is an orientation 
defined by determinants, explored further in von Schemde (2005). A recent account of 
integer pivoting is Azulay and Pique (2001). Proposition 3.8 is due to Winkels (1979) 
and Jansen (1981). 

Extensive games with information sets are due to Kuhn (1953). Subgame perfection 
(Selten, 1975) is one of many refinements of Nash equilibria (van Damme, 1987).
Main ideas of the sequence form have been discovered independently by (Koller and 
Megiddo, 1992; Romanovskii, 1962; von Stengel, 1996). Lemke’s algorithm (Lemke, 
1965) is applied to the sequence form in Koller et al. (1996); von Stengel et al. (2002). 

A recent paper, with further references, on algorithms for finding equilibria of games 
with more than two players, is Datta (2003). 


3.13 Discussion and Open Problems 


We have described the basic mathematical structure of Nash equilibria for two-player 
games, namely polyhedra and the complementarity condition of best responses. The 
resulting algorithms should simplify the analysis of larger games as used by applied 
game theorists. At present, existing software packages (Avis, 2005; Canty, 2003; McKelvey
et al., 2006) are prototypes that are not easy to use. Improved implementations
should lead to more widespread use of the algorithms, and reveal which kinds of
games practitioners are interested in. If the games are discretized versions of games
in economic settings, enumerating all equilibria will soon hit the size barriers of these
exponential algorithms. Then the LH algorithm may possibly be used to give an indication
of whether the game has only one Nash equilibrium, or Lemke’s method with varying
starting point as in von Stengel et al. (2002). This should give practical evidence on whether
these algorithms usually have good running times, as is widely believed, in contrast to
the extremal examples in Savani and von Stengel (2006). An open theoretical question is whether
LH, or Lemke’s algorithm, has expected polynomial running time, as is known for
the simplex method, for suitable probabilistic assumptions on the instance data.

The computational complexity of finding one Nash equilibrium of a two-player 
game, as discussed in Chapter 2, is open in the sense that not even a subexponential 
algorithm is known. Incremental or divide-and-conquer approaches, perhaps using the 
polyhedral structure, require a generalization of the equilibrium condition, because 
equilibria typically do not result from equilibria of games with fewer strategies. At 
the same time, such an approach must not maintain the entire set of Nash equilibria, 
because questions about that set (such as uniqueness, see Theorem 2.3) are typically 
NP-hard. 

Extensive games are a general model of dynamic games. The condition of perfect 
recall leads to canonical representations and algorithms, as described. Special types of 
extensive games, like repeated games and Bayesian games, are widely used in applied 
game theory. Finding equilibria of these models — where that task is difficult — should 
give a focus for further research. 


Bibliography 


C. Audet, P. Hansen, B. Jaumard, and G. Savard. Enumeration of all extreme equilibria of bimatrix 
games. SIAM J. Sci. Comput. 23, 323-338, 2001. 

D. Avis and K. Fukuda. A pivoting algorithm for convex hulls and vertex enumeration of arrangements 
and polyhedra. Disc. Comp. Geometry 8, 295-313, 1992. 

D. Avis. User’s Guide for lrs. Available at: http://cgm.cs.mcgill.ca/~avis, 2005.

D.-O. Azulay and J.-F. Pique. A revised simplex method with integer Q-matrices. ACM Trans. Math. 
Software 27, 350-360, 2001. 

C. Bron and J. Kerbosch. Finding all cliques of an undirected graph. Comm. ACM 16, 575-577,
1973.

M.J. Canty. Resolving Conflict with Mathematica: Algorithms for Two-Person Games. Academic 
Press, Amsterdam, 2003. 

R.S. Datta. Using computer algebra to compute Nash equilibria. Proc. 2003 Int. Symp. Symbolic and
Algebraic Computation, ACM, 74-79, 2003.

J. Dickhaut and T. Kaplan. A program for finding Nash equilibria. Math. J. 1:4, 87-93, 1991. 

M.J.M. Jansen. Maximal Nash subsets for bimatrix games. Naval Res. Logistics Q. 28, 147-152, 
1981. 

H. Keiding. On the maximal number of Nash equilibria in an n x n bimatrix game. Games Econ. 
Behav. 21, 148-160, 1997. 

D. Koller and N. Megiddo. The complexity of two-person zero-sum games in extensive form. Games 
Econ. Behav. 4, 528-552, 1992. 

D. Koller, N. Megiddo, and B. von Stengel. Efficient computation of equilibria for extensive two- 
person games. Games Econ. Behav. 14, 247-259, 1996. 

H.W. Kuhn. Extensive games and the problem of information. In: Contributions to the Theory of
Games II, eds. H. W. Kuhn and A. W. Tucker, Ann. Math. Studies 28, Princeton Univ. Press,
Princeton, 193-216, 1953.

H.W. Kuhn. An algorithm for equilibrium points in bimatrix games. Proc. National Academy of 
Sciences of the U.S.A. 47, 1657-1662, 1961. 

C.E. Lemke. Bimatrix equilibrium points and mathematical programming. Manag. Sci. 11, 681-689, 
1965. 


C.E. Lemke and J.T. Howson, Jr. Equilibrium points of bimatrix games. J. SIAM 12, 413-423, 1964. 

O.L. Mangasarian. Equilibrium points in bimatrix games. J. SIAM 12, 778-780, 1964. 

R.D. McKelvey, A. McLennan, and T.L. Turocy. Gambit: Software Tools for Game Theory. Available 
at: http://econweb.tamu.edu/gambit, 2006. 

P. McMullen. The maximum number of faces of a convex polytope. Mathematika 17, 179-184, 1970. 

J.F. Nash. Non-cooperative games. Ann. Math. 54, 286-295, 1951.

I.V. Romanovskii. Reduction of a game with complete memory to a matrix game. Soviet Math. 3, 
678-681, 1962. 

G.D. Rosenberg. Enumeration of all extreme equilibria of bimatrix games with integer pivoting 
and improved degeneracy check. CDAM Res. Rep. LSE-CDAM-2005-18, London School of 
Economics, 2004. 

R. Savani and B. von Stengel. Hard-to-solve bimatrix games. Econometrica 74, 397-429, 2006. 

R. Selten. Reexamination of the perfectness concept for equilibrium points in extensive games. Int. 
J. Game Theory 4, 22-55, 1975. 

L.S. Shapley. A note on the Lemke-Howson algorithm. Mathematical Programming Study 1: Pivoting
and Extensions, 175-189, 1974.

E. van Damme. Stability and Perfection of Nash Equilibria. Springer, Berlin, 1987. 

A. von Schemde. Index and Stability in Bimatrix Games. Springer, Berlin, 2005. 

B. von Stengel. Efficient computation of behavior strategies. Games Econ. Behav. 14, 220-246, 1996. 

B. von Stengel. New maximal numbers of equilibria in bimatrix games. Disc. Comp. Geometry 21, 
557-568, 1999. 

B. von Stengel. Computing equilibria for two-person games. In: Handbook of Game Theory with 
Economic Applications, eds. R.J. Aumann and S. Hart, Elsevier, Amsterdam, 3, 1723-1759, 2002. 

B. von Stengel, A.H. van den Elzen, and A.J.J. Talman. Computing normal form perfect equilibria 
for extensive two-person games. Econometrica 70, 693-715, 2002. 

N.N. Vorob’ev. Equilibrium points in bimatrix games. Theory of Probability and its Applications 3, 
297-309, 1958. 

H.-M. Winkels. An algorithm to determine all equilibrium points of a bimatrix game. In: Game 
Theory and Related Topics, eds. O. Moeschlin and D. Pallaschke, North-Holland, Amsterdam, 
137-148, 1979. 

G.M. Ziegler. Lectures on Polytopes. Springer, New York, 1995. 


Exercises 


3.1 Prove the claim made after Algorithm 3.4 that nonunique solutions to the equations 
in that algorithm occur only for degenerate games. 

3.2 Show that in an equilibrium of a nondegenerate game, all pure best responses are 
played with positive probability. 

3.3 Give further details of the argument made after Algorithm 3.6 that LH terminates.
A duplicate label of a vertex pair (x, y) can be dropped in either polytope. Interpret
these two possibilities.

3.4 Why is every pure strategy equilibrium found by LH for a suitable missing label?

3.5 Show that the “projection” to polytope P, say, of an LH path from (x, y) to (x', y')
in P × Q is also a path in P from x to x'. Hence, if (x, y) is an equilibrium, where
can x be on that projected path?


3.6 Verify the LH paths for the example (3.7).

3.7 Apply integer pivoting to the system $r + Ay = \mathbf{1}$ in the example, omitted after
(3.13).

3.8 After (3.14), what is the multiplier in the “suitable multiple of the pivot row”? Give
formulas for the update rules of the tableau.

3.9 Draw the polytope P for the game (3.18), and verify that the described naive use
of LH fails.

3.10 Implement the lexico-minimum ratio test for the system (3.19) using the data in
(3.17); you need a suitable array to identify the order of the basic variables.

3.11 Adapt a clique enumeration algorithm for graphs such as (Bron and Kerbosch,
1973) to find all maximal Nash subsets (see at the end of Section 3.6).

3.12 Consider an extensive game with a binary game tree of depth L (and thus $2^L$
leaves), where the two players alternate and are informed about all past moves
except for the last move of the other player (see von Stengel et al., 2002). How
many reduced strategies do the players have?

3.13 Prove Proposition 3.10, using (3.20), (3.21), and (3.23). Prove Proposition 3.11.

3.14 Write down the LCP of Theorem 3.14 for the game in Figure 3.3. Find all its
solutions, for example with a variant of Algorithm 3.4.


CHAPTER 4 


Learning, Regret Minimization, 
and Equilibria 


Avrim Blum and Yishay Mansour 


Abstract 


Many situations involve repeatedly making decisions in an uncertain environment: for instance, 
deciding what route to drive to work each day, or repeated play of a game against an opponent with an 
unknown strategy. In this chapter we describe learning algorithms with strong guarantees for settings 
of this type, along with connections to game-theoretic equilibria when all players in a system are 
simultaneously adapting in such a manner. 

We begin by presenting algorithms for repeated play of a matrix game with the guarantee that 
against any opponent, they will perform nearly as well as the best fixed action in hindsight (also called 
the problem of combining expert advice or minimizing external regret). In a zero-sum game, such 
algorithms are guaranteed to approach or exceed the minimax value of the game, and even provide 
a simple proof of the minimax theorem. We then turn to algorithms that minimize an even stronger 
form of regret, known as internal or swap regret. We present a general reduction showing how to 
convert any algorithm for minimizing external regret to one that minimizes this stronger form of 
regret as well. Internal regret is important because when all players in a game minimize this stronger 
type of regret, the empirical distribution of play is known to converge to correlated equilibrium. 

The third part of this chapter explains a different reduction: how to convert from the full information 
setting in which the action chosen by the opponent is revealed after each time step, to the partial 
information (bandit) setting, where at each time step only the payoff of the selected action is observed 
(such as in routing), and still maintain a small external regret. 

Finally, we end by discussing routing games in the Wardrop model, where one can show that if 
all participants minimize their own external regret, then overall traffic is guaranteed to converge to 
an approximate Nash Equilibrium. This further motivates price-of-anarchy results. 


4.1 Introduction 


In this chapter we consider the problem of repeatedly making decisions in an uncertain 
environment. The basic setting is we have a space of N actions, such as what route to 
use to drive to work, or the rows of a matrix game like {rock, paper, scissors}. At each 
time step, the algorithm probabilistically chooses an action (say, selecting what route 
to take), the environment makes its “move” (setting the road congestions on that day), 
and the algorithm then incurs the loss for its action chosen (how long its route took). 
The process then repeats the next day. What we would like are adaptive algorithms that 
can perform well in such settings, as well as to understand the dynamics of the system 
when there are multiple players, all adjusting their behavior in such a way. 

A key technique for analyzing problems of this sort is known as regret analysis. 
The motivation behind regret analysis can be viewed as the following: we design 
a sophisticated online algorithm that deals with various issues of uncertainty and 
decision making, and sell it to a client. Our algorithm runs for some time and incurs a 
certain loss. We would like to avoid the embarrassment that our client will come back 
to us and claim that in retrospect we could have incurred a much lower loss if we used 
his simple alternative policy π. The regret of our online algorithm is the difference 
between the loss of our algorithm and the loss using π. 

Different notions of regret quantify differently what is considered to be a “simple” 
alternative policy. External regret, also called the problem of combining expert advice, 
compares performance to the best single action in retrospect. This implies that the 
simple alternative policy performs the same action in all time steps, which indeed is 
quite simple. Nonetheless, external regret provides a general methodology for devel- 
oping online algorithms whose performance matches that of an optimal static offline 
algorithm by modeling the possible static solutions as different actions. In the context 
of machine learning, algorithms with good external regret bounds can be powerful 
tools for achieving performance comparable to the optimal prediction rule from some 
large class of hypotheses. 

In Section 4.3 we describe several algorithms with particularly strong external regret 
bounds. We start with the very weak greedy algorithm, and build up to an algorithm 
whose loss is at most $O(\sqrt{T \log N})$ greater than that of the best action, where T is 
the number of time steps. That is, the regret per time step drops as $O(\sqrt{(\log N)/T})$. 
In Section 4.4 we show that in a zero-sum game, such algorithms are guaranteed to 
approach or exceed the value of the game, and even yield a simple proof of the minimax 
theorem. 

A second category of alternative policies are those that consider the online sequence 
of actions and suggest a simple modification to it, such as “every time you bought IBM, 
you should have bought Microsoft instead.” While one can study very general classes 
of modification rules, the most common form, known as internal or swap regret, allows 
one to modify the online action sequence by changing every occurrence of a given 
action i by an alternative action j. (The distinction between internal and swap regret 
is that internal regret allows only one action to be replaced by another, whereas swap 
regret allows any mapping from {1,..., N} to {1,..., N} and can be up to a factor N 
larger). In Section 4.5 we present a simple way to efficiently convert any external regret 
minimizing algorithm into one that minimizes swap regret with only a factor N increase 
in the regret term. Using the results for external regret this achieves a swap regret bound 
of $O(\sqrt{TN \log N})$. (Algorithms for swap regret have also been developed from first 
principles—see the Notes section of this chapter for references—but this procedure 
gives the best bounds known for efficient algorithms.) 

The importance of swap regret is due to its tight connection to correlated equilibria, 
defined in Chapter 1. In fact, one way to think of a correlated equilibrium is that it 
is a distribution Q over the joint action space such that every player would have zero 
internal (or swap) regret when playing it. As we point out in Section 4.4, if each player 
can achieve swap regret $\epsilon T$, then the empirical distribution of the joint actions of the 
players will be an $\epsilon$-correlated equilibrium. 

We also describe how external regret results can be extended to the partial infor- 
mation model, also called the multiarmed bandit (MAB) problem. In this model, the 
online algorithm only gets to observe the loss of the action actually selected, and does 
not see the losses of the actions not chosen. For example, in the case of driving to 
work, you may only observe the travel time on the route you actually drive, and do not 
get to find out how long it would have taken had you chosen some alternative route. 
In Section 4.6 we present a general reduction, showing how to convert an algorithm 
with low external regret in the full information model to one for the partial information 
model (though the bounds produced are not the best known bounds for this problem). 

Notice that the route-choosing problem can be viewed as a general-sum game: your 
travel time depends on the choices of the other drivers as well. In Section 4.7 we 
discuss results showing that in the Wardrop model of infinitesimal agents (considered 
in Chapter 18), if each driver acts to minimize external regret, then traffic flow over 
time can be shown to approach an approximate Nash equilibrium. This serves to further 
motivate price-of-anarchy results in this context, since it means they apply to the case 
that participants are using well-motivated self-interested adaptive behavior. 

We remark that the results we present in this chapter are not always the strongest 
known, and the interested reader is referred to the recent book (Cesa-Bianchi and 
Lugosi, 2006) that gives a thorough coverage of many of the topics in this chapter. 
See also the Notes section for further references. 


4.2 Model and Preliminaries 


We assume an adversarial online model where there are N available actions $X = \{1, \ldots, N\}$. 
At each time step t, an online algorithm H selects a distribution $p^t$ over the 
N actions. After that, the adversary selects a loss vector $\ell^t \in [0, 1]^N$, where 
$\ell_i^t \in [0, 1]$ is the loss of the i-th action at time t. In the full information model, 
the online algorithm H receives the loss vector $\ell^t$ and experiences a loss 
$\ell_H^t = \sum_{i=1}^N p_i^t \ell_i^t$. (This can be viewed as an expected loss when the online 
algorithm selects action $i \in X$ with probability $p_i^t$.) In the partial information model, 
the online algorithm receives $(\ell_{k^t}^t, k^t)$, where $k^t$ is distributed according to $p^t$, 
and $\ell_H^t = \ell_{k^t}^t$ is its loss. The loss of the i-th action during the first T time 
steps is $L_i^T = \sum_{t=1}^T \ell_i^t$, and the loss of H is $L_H^T = \sum_{t=1}^T \ell_H^t$. 

The aim for the external regret setting is to design an online algorithm that will 
be able to approach the performance of the best algorithm from a given class of 
algorithms G; namely, to have a loss close to $L_{G,\min}^T = \min_{g \in G} L_g^T$. Formally we would 
like to minimize the external regret $R_G = L_H^T - L_{G,\min}^T$, and G is called the comparison 
class. The most studied comparison class G is the one that consists of all the single 
actions, i.e., G = X. In this chapter we concentrate on this important comparison class, 
namely, we want the online algorithm’s loss to be close to $L_{\min}^T = \min_i L_i^T$, and let the 
external regret be $R = L_H^T - L_{\min}^T$. 
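
As a minimal sketch of these definitions, the following Python fragment computes $L_H^T$, the per-action losses $L_i^T$, and the external regret R for a recorded run in the full information model. The function names and the list-of-lists representation of the distributions $p^t$ and loss vectors $\ell^t$ are illustrative assumptions, not notation from the chapter.

```python
def algorithm_loss(dists, losses):
    """L_H^T: sum over time of the expected loss sum_i p_i^t * l_i^t."""
    return sum(
        sum(p * l for p, l in zip(p_t, l_t))
        for p_t, l_t in zip(dists, losses)
    )


def action_losses(losses):
    """[L_1^T, ..., L_N^T]: cumulative loss of each single action."""
    n = len(losses[0])
    return [sum(l_t[i] for l_t in losses) for i in range(n)]


def external_regret(dists, losses):
    """R = L_H^T - min_i L_i^T (comparison class G = X)."""
    return algorithm_loss(dists, losses) - min(action_losses(losses))
```

For example, an algorithm that always plays the uniform distribution over two actions, when the first action always loses 1 and the second always loses 0, has external regret T/2 after T steps.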

External regret uses a fixed comparison class G, but one can also envision a comparison 
class that depends on the online algorithm’s actions. We can consider modification 
rules that modify the actions selected by the online algorithm, producing an alternative 
strategy which we will want to compete against. A modification rule F has as input the 
history and the current action selected by the online procedure and outputs a (possibly 
different) action. (We denote by $F^t$ the function F at time t, including any dependency 
on the history.) Given a sequence of probability distributions $p^t$ used by an online 
algorithm H, and a modification rule F, we define a new sequence of probability 
distributions $f^t = F^t(p^t)$, where $f_i^t = \sum_{j : F^t(j) = i} p_j^t$. The loss of the modified sequence 
is $L_{H,F}^T = \sum_{t=1}^T \sum_i f_i^t \ell_i^t$. Note that at time t the modification rule F shifts the probability 
that H assigned to action j to action $F^t(j)$. This implies that the modification rule F 
generates a different distribution, as a function of the online algorithm’s distribution 
$p^t$. 

We will focus on the case of a finite set F of memoryless modification rules (they 
do not depend on history). Given a sequence of loss vectors, the regret of an online 
algorithm H with respect to the modification rules F is 


$$R_{\mathcal{F}} = \max_{F \in \mathcal{F}} \{ L_H^T - L_{H,F}^T \}.$$


Note that the external regret setting is equivalent to having a set $\mathcal{F}^{ex}$ of N modification 
rules $F_i$, where $F_i$ always outputs action i. For internal regret, the set $\mathcal{F}^{in}$ 
consists of N(N − 1) modification rules $F_{i,j}$, where $F_{i,j}(i) = j$ and $F_{i,j}(i') = i'$ for 
$i' \ne i$. That is, the internal regret of H is 

$$\max_{F \in \mathcal{F}^{in}} \{ L_H^T - L_{H,F}^T \} = \max_{i,j \in X} \Big\{ \sum_{t=1}^T p_i^t (\ell_i^t - \ell_j^t) \Big\}.$$

A more general class of memoryless modification rules is swap regret defined by the 
class $\mathcal{F}^{sw}$, which includes all $N^N$ functions $F : \{1, \ldots, N\} \to \{1, \ldots, N\}$, where the 
function F swaps the current online action i with F(i) (which can be the same or a 
different action). That is, the swap regret of H is 

$$\max_{F \in \mathcal{F}^{sw}} \{ L_H^T - L_{H,F}^T \} = \sum_{i=1}^N \max_{j \in X} \Big\{ \sum_{t=1}^T p_i^t (\ell_i^t - \ell_j^t) \Big\}.$$

Note that since $\mathcal{F}^{ex} \subseteq \mathcal{F}^{sw}$ and $\mathcal{F}^{in} \subseteq \mathcal{F}^{sw}$, both external and internal regret are 
upper-bounded by swap regret. (See also Exercises 4.1 and 4.2.) 
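
The two closed forms above translate directly into code. The following Python sketch evaluates internal and swap regret for a recorded run; the helper names are hypothetical, distributions and loss vectors are assumed to be lists indexed by action, and at least two actions are assumed.

```python
def pairwise_gain(dists, losses, i, j):
    """sum_t p_i^t * (l_i^t - l_j^t): gain from redirecting action i's mass to j."""
    return sum(p_t[i] * (l_t[i] - l_t[j]) for p_t, l_t in zip(dists, losses))


def internal_regret(dists, losses):
    """Best single replacement i -> j with j != i (requires N >= 2)."""
    n = len(losses[0])
    return max(
        pairwise_gain(dists, losses, i, j)
        for i in range(n)
        for j in range(n)
        if i != j
    )


def swap_regret(dists, losses):
    """Best mapping F: each action i is independently sent to its best target j."""
    n = len(losses[0])
    return sum(
        max(pairwise_gain(dists, losses, i, j) for j in range(n))
        for i in range(n)
    )
```

Because j = i is allowed in the swap case and contributes a gain of 0, every term in the outer sum is nonnegative, which is one way to see that swap regret is at least internal regret.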


4.3 External Regret Minimization 


Before describing the external regret results, we begin by pointing out that it is not 
possible to guarantee low regret with respect to the overall optimal sequence of 
decisions in hindsight, as is done in competitive analysis (Borodin and El-Yaniv, 1998; 
Sleator and Tarjan, 1985). This will motivate why we will be concentrating on more 
restricted comparison classes. In particular, let $G_{all}$ be the set of all functions mapping 
times $\{1, \ldots, T\}$ to actions $X = \{1, \ldots, N\}$. 


Theorem 4.1 For any online algorithm H there exists a sequence of T loss 
vectors such that regret $R_{G_{all}}$ is at least $T(1 - 1/N)$. 


PROOF The sequence is simply as follows: at each time t, the action $i_t$ of lowest 
probability $p_i^t$ gets a loss of 0, and all the other actions get a loss of 1. Since 
$\min_i \{p_i^t\} \le 1/N$, this means the loss of H in T time steps is at least $T(1 - 1/N)$. 
On the other hand, there exists $g \in G_{all}$, namely $g(t) = i_t$, with a total loss of 0. 


The above proof shows that if we consider all possible functions, we have a very large 
regret. For the rest of the section we will use the comparison class $G_X = \{g_i : i \in X\}$, 
where $g_i$ always selects action i. Namely, we compare the online algorithm to the best 
single action. 


4.3.1 Warmup: Greedy and Randomized-Greedy Algorithms 


In this section, for simplicity we will assume that all losses are either 0 or 1 (rather than 
a real number in [0, 1]), which will simplify notation and proofs, although everything 
presented can be easily extended to the general case. 

Our first attempt to develop a good regret minimization algorithm will be to consider 
the greedy algorithm. Recall that $L_i^t = \sum_{\tau=1}^t \ell_i^\tau$, namely the cumulative loss up to time 
t of action i. The Greedy algorithm at each time t selects action $x^t = \arg\min_{i \in X} L_i^{t-1}$ 
(if there are multiple actions with the same cumulative loss, it prefers the action with 
the lowest index). Formally: 


Greedy Algorithm 
Initially: $x^1 = 1$. 
At time t: Let $L_{\min}^{t-1} = \min_{i \in X} L_i^{t-1}$, and $S^{t-1} = \{i : L_i^{t-1} = L_{\min}^{t-1}\}$. 
Let $x^t = \min S^{t-1}$. 
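
The box above can be sketched in a few lines of Python. This is an illustrative implementation under the assumption that actions are indexed from 0 and that each loss vector is a list of 0/1 values; the function names are not from the chapter.

```python
def greedy_choice(cum_losses):
    """Lowest-index action among those with minimum cumulative loss so far."""
    return cum_losses.index(min(cum_losses))


def run_greedy(losses):
    """Play Greedy on a loss sequence; return (L_Greedy^T, L_min^T)."""
    n = len(losses[0])
    cum = [0] * n          # L_i^{t-1} for every action i
    total = 0              # loss incurred by Greedy so far
    for l_t in losses:
        x_t = greedy_choice(cum)   # decided before l_t is revealed
        total += l_t[x_t]
        cum = [c + l for c, l in zip(cum, l_t)]
    return total, min(cum)
```

On the three-step sequence in which the adversary always charges the action Greedy is about to pick, Greedy loses at every step while the best action loses only once.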


Theorem 4.2 The Greedy algorithm, for any sequence of losses has 

$$L_{Greedy}^T \le N \cdot L_{\min}^T + (N - 1).$$


PROOF At each time t such that Greedy incurs a loss of 1 and $L_{\min}^t$ does 
not increase, at least one action is removed from $S^t$. This can occur at most 
N times before $L_{\min}^t$ increases by 1. Therefore, Greedy incurs loss at most N 
between successive increments in $L_{\min}^t$. More formally, this shows inductively 
that $L_{Greedy}^t \le N - |S^t| + N \cdot L_{\min}^t$. 


The above guarantee on Greedy is quite weak, stating only that its loss is at most 
a factor of N larger than the loss of the best action. The following theorem shows 
that this weakness is shared by any deterministic online algorithm. (A deterministic 
algorithm concentrates its entire weight on a single action at each time step.) 


Theorem 4.3 For any deterministic algorithm D there exists a loss sequence 
for which $L_D^T = T$ and $L_{\min}^T = \lfloor T/N \rfloor$. 


Note that the above theorem implies that $L_D^T \ge N \cdot L_{\min}^T + (T \bmod N)$, which almost 
matches the upper bound for Greedy (Theorem 4.2). 


PROOF Fix a deterministic online algorithm D and let $x^t$ be the action it selects 
at time t. We will generate the loss sequence in the following way. At time t, let 
the loss of $x^t$ be 1 and the loss of any other action be 0. This ensures that D incurs 
loss 1 at each time step, so $L_D^T = T$. 

Since there are N different actions, there is some action that algorithm D has 
selected at most $\lfloor T/N \rfloor$ times. By construction, only the actions selected by D 
ever have a loss, so this implies that $L_{\min}^T \le \lfloor T/N \rfloor$. 
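
The adversary in this proof is easy to simulate. The sketch below assumes, for illustration only, that the deterministic algorithm is given as a function `choose` of the cumulative loss vector (a deterministic algorithm may in general depend on the whole history); `lowest_index_greedy` is a hypothetical stand-in for Greedy.

```python
def adversary_vs_deterministic(choose, n, T):
    """Charge loss 1 to whichever action the algorithm picks, 0 to all others.

    Returns (loss of the algorithm, cumulative loss of every action).
    """
    cum = [0] * n
    alg_loss = 0
    for _ in range(T):
        x_t = choose(cum)      # the adversary can predict this choice
        cum[x_t] += 1          # only the chosen action is charged
        alg_loss += 1
    return alg_loss, cum


def lowest_index_greedy(cum):
    """Greedy with the lowest-index tie breaker."""
    return cum.index(min(cum))
```

With N = 3 and T = 7, Greedy loses at all 7 steps while the best action in hindsight loses only ⌊7/3⌋ = 2 times.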


Theorem 4.3 motivates considering randomized algorithms. In particular, one weak- 
ness of the greedy algorithm was that it had a deterministic tie breaker. One can hope 
that if the online algorithm splits its weight between all the currently best actions, 
better performance could be achieved. Specifically, let Randomized Greedy (RG) be 
the procedure that assigns a uniform distribution over all those actions with minimum 
total loss so far. We now will show that this algorithm achieves a significant perfor- 
mance improvement: its loss is at most an O(log N) factor from the best action, rather 
than O(N). (This is similar to the analysis of the randomized marking algorithm in 
competitive analysis.) 


Randomized Greedy (RG) Algorithm 
Initially: $p_i^1 = 1/N$ for $i \in X$. 
At time t: Let $L_{\min}^{t-1} = \min_{i \in X} L_i^{t-1}$, and $S^{t-1} = \{i : L_i^{t-1} = L_{\min}^{t-1}\}$. 
Let $p_i^t = 1/|S^{t-1}|$ for $i \in S^{t-1}$ and $p_i^t = 0$ otherwise. 
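
A short Python sketch of Randomized Greedy follows, tracking the expected loss $\sum_i p_i^t \ell_i^t$ rather than sampling an action. The function names and the 0-based indexing are assumptions made for illustration.

```python
def rg_distribution(cum_losses):
    """Uniform distribution over the actions with minimum cumulative loss."""
    best = min(cum_losses)
    leaders = [i for i, c in enumerate(cum_losses) if c == best]
    share = 1.0 / len(leaders)
    return [share if c == best else 0.0 for c in cum_losses]


def run_rg(losses):
    """Return (expected loss of RG, L_min^T) on a 0/1 loss sequence."""
    n = len(losses[0])
    cum = [0] * n
    expected = 0.0
    for l_t in losses:
        p_t = rg_distribution(cum)   # uses losses up to time t-1 only
        expected += sum(p * l for p, l in zip(p_t, l_t))
        cum = [c + l for c, l in zip(cum, l_t)]
    return expected, min(cum)
```

On the sequence that eliminates one leader per step, the expected loss is 1/3 + 1/2 + 1, which is the harmonic-type sum that appears in the analysis of the algorithm.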


Theorem 4.4 The Randomized Greedy (RG) algorithm, for any loss sequence, has 

$$L_{RG}^T \le (\ln N) + (1 + \ln N) L_{\min}^T.$$


PROOF The proof follows from showing that the loss incurred by Randomized 
Greedy between successive increases in $L_{\min}^t$ is at most $1 + \ln N$. Specifically, let 
$t_j$ denote the time step at which $L_{\min}^t$ first reaches a loss of j, so we are interested 
in the loss of Randomized Greedy between time steps $t_j$ and $t_{j+1}$. At any time t 
we have $1 \le |S^t| \le N$. Furthermore, if at time $t \in (t_j, t_{j+1}]$ the size of $S^t$ shrinks 
by k from some size n’ down to n’ − k, then the loss of the online algorithm 
RG is k/n’, since each such action has weight 1/n’. Finally, notice that we can 
upper bound k/n’ by $1/n' + 1/(n' - 1) + \cdots + 1/(n' - k + 1)$. Therefore, over 
the entire time-interval $(t_j, t_{j+1}]$, the loss of Randomized Greedy is at most: 

$$1/N + 1/(N - 1) + 1/(N - 2) + \cdots + 1/1 \le 1 + \ln N.$$

More formally, this shows inductively that 
$L_{RG}^t \le (1/N + 1/(N - 1) + \cdots + 1/(|S^t| + 1)) + (1 + \ln N) \cdot L_{\min}^t$. 


4.3.2 Randomized Weighted Majority Algorithm 


Although Randomized Greedy achieved a significant performance gain compared 
to the Greedy algorithm, we still have a logarithmic ratio to the best action. Looking 
more closely at the proof, one can see that the losses are greatest when the sets $S^t$ 
are small, since the online loss can be viewed as proportional to $1/|S^t|$. One way to 
overcome this weakness is to give some weight to actions which are currently “near 
best.” That is, we would like the probability mass on some action to decay gracefully 
with its distance to optimality. This is the idea of the Randomized Weighted Majority 
algorithm of Littlestone and Warmuth. 

Specifically, in the Randomized Weighted Majority algorithm, we give an action i 
whose total loss so far is $L_i$ a weight $w_i = (1 - \eta)^{L_i}$, and then choose probabilities 
proportional to the weights: $p_i = w_i / \sum_{j=1}^N w_j$. The parameter η will be set to optimize 
certain trade-offs but conceptually think of it as a small constant, say 0.01. In this 
section we will again assume losses in {0, 1} rather than [0, 1] because it allows for 
an especially intuitive interpretation of the proof (Theorem 4.5). We then relax this 
assumption in the next section (Theorem 4.6). 


Randomized Weighted Majority (RWM) Algorithm 

Initially: $w_i^1 = 1$ and $p_i^1 = 1/N$, for $i \in X$. 

At time t: If $\ell_i^{t-1} = 1$, let $w_i^t = w_i^{t-1}(1 - \eta)$; else ($\ell_i^{t-1} = 0$) let $w_i^t = w_i^{t-1}$. 
Let $p_i^t = w_i^t / W^t$, where $W^t = \sum_{i \in X} w_i^t$. 


Algorithm RWM and Theorem 4.5 can be generalized to losses in [0, 1] by replacing the 
update rule with $w_i^t = w_i^{t-1}(1 - \eta)^{\ell_i^{t-1}}$ (see Exercise 4.3). 
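
As an illustrative sketch (not the authors' code), RWM can be written as follows. The function name `rwm` is an assumption; the update uses the generalized rule $(1 - \eta)^{\ell}$, which coincides with the boxed rule when losses are 0 or 1, and the expected loss is accumulated instead of sampling actions.

```python
def rwm(losses, eta):
    """Run Randomized Weighted Majority; return (expected loss, final weights)."""
    n = len(losses[0])
    w = [1.0] * n                       # w_i^1 = 1
    expected = 0.0
    for l_t in losses:
        total = sum(w)                  # W^t
        p_t = [w_i / total for w_i in w]
        expected += sum(p * l for p, l in zip(p_t, l_t))
        # Multiply by (1 - eta) exactly when the action lost 1 (l in {0, 1}).
        w = [w_i * (1.0 - eta) ** l for w_i, l in zip(w, l_t)]
    return expected, w
```

At the end of a run, the weight of action i equals $(1 - \eta)^{L_i^T}$, so the largest weight belongs to the best action in hindsight.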


Theorem 4.5 For $\eta \le 1/2$, the loss of Randomized Weighted Majority 
(RWM) on any sequence of binary {0, 1} losses satisfies 

$$L_{RWM}^T \le (1 + \eta) L_{\min}^T + \frac{\ln N}{\eta}.$$

Setting $\eta = \min\{\sqrt{(\ln N)/T}, 1/2\}$ yields $L_{RWM}^T \le L_{\min}^T + 2\sqrt{T \ln N}$. 


(Note: The second part of the theorem assumes T is known in advance. If T is unknown, 
then a “guess and double” approach can be used to set η with just a constant-factor loss in 
regret. In fact, one can achieve the potentially better bound $L_{RWM}^T \le L_{\min}^T + 2\sqrt{L_{\min}^T \ln N}$ 
by setting $\eta = \min\{\sqrt{(\ln N)/L_{\min}^T}, 1/2\}$.) 


PROOF The key to the proof is to consider the total weight $W^t$. What we will 
show is that anytime the online algorithm has significant expected loss, the total 
weight must drop substantially. We will then combine this with the fact that 
$W^{T+1} \ge \max_i w_i^{T+1} = (1 - \eta)^{L_{\min}^T}$ to achieve the desired bound. 

Specifically, let $F^t = (\sum_{i : \ell_i^t = 1} w_i^t)/W^t$ denote the fraction of the weight $W^t$ 
that is on actions that experience a loss of 1 at time t; so, $F^t$ equals the expected 
loss of algorithm RWM at time t. Now, each of the actions experiencing a loss 
of 1 has its weight multiplied by $(1 - \eta)$ while the rest are unchanged. Therefore, 
$W^{t+1} = W^t - \eta F^t W^t = W^t(1 - \eta F^t)$. In other words, the proportion of 
the weight removed from the system at each time t is exactly proportional to the 
expected loss of the online algorithm. Now, using the fact that $W^1 = N$ and using 
our lower bound on $W^{T+1}$ we have 

$$(1 - \eta)^{L_{\min}^T} \le W^{T+1} = W^1 \prod_{t=1}^T (1 - \eta F^t) = N \prod_{t=1}^T (1 - \eta F^t).$$

Taking logarithms, 

$$\begin{aligned}
L_{\min}^T \ln(1 - \eta) &\le (\ln N) + \sum_{t=1}^T \ln(1 - \eta F^t) \\
&\le (\ln N) - \sum_{t=1}^T \eta F^t && \text{(using the inequality } \ln(1 - z) \le -z\text{)} \\
&= (\ln N) - \eta L_{RWM}^T && \text{(by definition of } F^t\text{)}.
\end{aligned}$$

Therefore, 

$$\begin{aligned}
L_{RWM}^T &\le \frac{-L_{\min}^T \ln(1 - \eta)}{\eta} + \frac{\ln N}{\eta} \\
&\le (1 + \eta) L_{\min}^T + \frac{\ln N}{\eta} && \text{(using the inequality } -\ln(1 - z) \le z + z^2 \text{ for } 0 \le z \le \tfrac{1}{2}\text{)},
\end{aligned}$$

which completes the proof. 


4.3.3 Polynomial Weights Algorithm 


The Polynomial Weights (PW) algorithm is a natural extension of the RWM algo- 
rithm to losses in [0, 1] (or even to the case of both losses and gains, see Exercise 4.4) 
that maintains the same proof structure as that used for RWM and in addition performs 
especially well in the case of small losses. 


Polynomial Weights (PW) Algorithm 
Initially: $w_i^1 = 1$ and $p_i^1 = 1/N$, for $i \in X$. 
At time t: Let $w_i^t = w_i^{t-1}(1 - \eta \ell_i^{t-1})$. 
Let $p_i^t = w_i^t / W^t$, where $W^t = \sum_{i \in X} w_i^t$. 


Notice that the only difference between PW and RWM is in the update step. In particular, 
it is no longer necessarily the case that an action of total loss L has weight $(1 - \eta)^L$. 
However, what is maintained is the property that if the algorithm’s loss at time t is 
$F^t$, then exactly an $\eta F^t$ fraction of the total weight is removed from the system. 
Specifically, from the update rule we have $W^{t+1} = W^t - \sum_i \eta w_i^t \ell_i^t = W^t(1 - \eta F^t)$ 
where $F^t = (\sum_i w_i^t \ell_i^t)/W^t$ is the loss of PW at time t. We can use this fact to prove the 
following. 




Theorem 4.6 The Polynomial Weights (PW) algorithm, using $\eta \le 1/2$, for 
any [0, 1]-valued loss sequence and for any k has 

$$L_{PW}^T \le L_k^T + \eta Q_k^T + \frac{\ln N}{\eta},$$

where $Q_k^T = \sum_{t=1}^T (\ell_k^t)^2$. Setting $\eta = \min\{\sqrt{(\ln N)/T}, 1/2\}$ and noting that $Q_k^T \le T$, 
we have $L_{PW}^T \le L_{\min}^T + 2\sqrt{T \ln N}$.¹ 


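PROOF As noted above, we have $W^{t+1} = W^t(1 - \eta F^t)$, where $F^t$ is PW’s loss 
at time t. So, as with the analysis of RWM, we have $W^{T+1} = N \prod_{t=1}^T (1 - \eta F^t)$ 
and therefore 

$$\ln W^{T+1} = \ln N + \sum_{t=1}^T \ln(1 - \eta F^t) \le \ln N - \eta \sum_{t=1}^T F^t = \ln N - \eta L_{PW}^T.$$

Now for the lower bound, we have 

$$\begin{aligned}
\ln W^{T+1} &\ge \ln w_k^{T+1} \\
&= \sum_{t=1}^T \ln(1 - \eta \ell_k^t) && \text{(using the recursive definition of weights)} \\
&\ge -\sum_{t=1}^T \eta \ell_k^t - \sum_{t=1}^T (\eta \ell_k^t)^2 && \text{(using the inequality } \ln(1 - z) \ge -z - z^2 \text{ for } 0 \le z \le \tfrac{1}{2}\text{)} \\
&= -\eta L_k^T - \eta^2 Q_k^T.
\end{aligned}$$

Combining the upper and lower bounds on $\ln W^{T+1}$ we have: 

$$-\eta L_k^T - \eta^2 Q_k^T \le \ln N - \eta L_{PW}^T,$$

which yields the theorem. 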
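
To make the bound of Theorem 4.6 concrete, here is a minimal Python sketch of PW together with the right-hand side of the theorem for a given comparison action k. The function names are assumptions for illustration; losses are assumed to be lists of values in [0, 1], and the expected loss is accumulated directly.

```python
import math


def polynomial_weights(losses, eta):
    """Run PW with update w_i <- w_i * (1 - eta * loss_i); return its total loss."""
    n = len(losses[0])
    w = [1.0] * n
    alg_loss = 0.0
    for l_t in losses:
        total = sum(w)                                   # W^t
        alg_loss += sum(w_i * l for w_i, l in zip(w, l_t)) / total   # F^t
        w = [w_i * (1.0 - eta * l) for w_i, l in zip(w, l_t)]
    return alg_loss


def pw_bound(losses, eta, k):
    """Right-hand side of Theorem 4.6: L_k^T + eta * Q_k^T + ln(N) / eta."""
    l_k = sum(l_t[k] for l_t in losses)
    q_k = sum(l_t[k] ** 2 for l_t in losses)
    return l_k + eta * q_k + math.log(len(losses[0])) / eta
```

A typical use is to pick $\eta = \min\{\sqrt{(\ln N)/T}, 1/2\}$ and compare `polynomial_weights(losses, eta)` against `pw_bound(losses, eta, k)` for every action k.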


4.3.4 Lower Bounds 


An obvious question is whether one can significantly improve the bound in Theorem 
4.6. We will show two simple results that imply that the regret bound is near optimal 
(see Exercise 4.5 for a better lower bound). The first result shows that one cannot hope 
to get sublinear regret when T is small compared to log N, and the second shows that 
one cannot hope to achieve regret $o(\sqrt{T})$ even when N = 2. 


Theorem 4.7 Consider $T < \log_2 N$. There exists a stochastic generation of 
losses such that, for any online algorithm R1, we have $E[L_{R1}^T] = T/2$ and yet 
$L_{\min}^T = 0$. 


¹ Again, for simplicity we assume that the number of time steps T is given as a parameter to the algorithm; 
otherwise, one can use a “guess and double” method to set η. 


PROOF Consider the following sequence of losses. At time t = 1, a random 
subset of N/2 actions gets a loss of 0 and the rest gets a loss of 1. At time t = 2, 
a random subset of N/4 of the actions that had loss 0 at time t = 1 gets a loss of 
0, and the rest (including actions that had a loss of 1 at time 1) gets a loss of 1. 
This process repeats: at each time step, a random subset of half of the actions that 
have received loss 0 so far gets a loss of 0, while all the rest gets a loss of 1. Any 
online algorithm incurs an expected loss of 1/2 at each time step, because at each 
time step t the expected fraction of probability mass $p_i^t$ on actions that receive 
a loss of 0 is at most 1/2. Yet, for $T < \log_2 N$ there will always be some action 
with total loss of 0. 
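
The halving construction in this proof can be sketched as a small generator of loss vectors. The function name and the use of an explicit `random.Random` instance are illustrative assumptions; N is assumed to be a power of two and $T < \log_2 N$.

```python
import random


def halving_losses(n, T, rng):
    """Loss sequence in which half of the still-unharmed actions survive each step."""
    alive = list(range(n))            # actions with total loss 0 so far
    sequence = []
    for _ in range(T):
        rng.shuffle(alive)
        alive = alive[: len(alive) // 2]   # a random half keeps loss 0
        survivors = set(alive)
        sequence.append([0 if i in survivors else 1 for i in range(n)])
    return sequence
```

With N = 16 and T = 3, the three loss vectors contain 8, 4, and 2 zeros respectively, and exactly two actions finish with total loss 0.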


Theorem 4.8 Consider N = 2. There exists a stochastic generation of losses 
such that, for any online algorithm R2, we have $E[L_{R2}^T - L_{\min}^T] = \Omega(\sqrt{T})$. 


PROOF At time t, we flip a fair coin and set $\ell^t = z_1 = (0, 1)$ with probability 1/2 
and $\ell^t = z_2 = (1, 0)$ with probability 1/2. For any distribution $p^t$ the expected 
loss at time t is exactly 1/2. Therefore any online algorithm R2 has expected loss 
of T/2. 

Given a sequence of T such losses, with T/2 + y losses $z_1$ and T/2 − y losses 
$z_2$, we have $T/2 - L_{\min}^T = |y|$. It remains to lower bound E[|y|]. Note that the 
probability of y is $\binom{T}{T/2 + y} / 2^T$, which is upper bounded by $O(1/\sqrt{T})$ (using a 
Stirling approximation). This implies that with a constant probability we have 
$|y| = \Omega(\sqrt{T})$, which completes the proof. 
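
The quantity E[|y|] in this proof can be computed exactly for small T, which gives a quick sanity check of the $\sqrt{T}$ growth. The sketch below assumes h counts the rounds with loss vector $z_1$, so that h is Binomial(T, 1/2) and y = h − T/2; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb, sqrt


def expected_gap(T):
    """Exact E[T/2 - L_min^T] = E|h - T/2| for h ~ Binomial(T, 1/2)."""
    return sum(
        Fraction(comb(T, h), 2 ** T) * abs(Fraction(2 * h - T, 2))
        for h in range(T + 1)
    )


def normalized_gap(T):
    """E|y| divided by sqrt(T); stays bounded away from 0 if E|y| = Omega(sqrt(T))."""
    return float(expected_gap(T)) / sqrt(T)
```

Evaluating `normalized_gap` for increasing even T is one way to see numerically that the expected regret of any algorithm on this distribution grows on the order of $\sqrt{T}$.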


4.4 Regret Minimization and Game Theory 


In this section we outline the connection between regret minimization and central 
concepts in game theory. We start by showing that in a two-player constant sum game, 
a player with external regret sublinear in T will have an average payoff that is at least 
the value of the game, minus a vanishing error term. For a general game, we will see that 
if all the players use procedures with sublinear swap-regret, then they will converge to 
an approximate correlated equilibrium. We also show that for a player who minimizes 
swap-regret, the frequency of playing dominated actions is vanishing. 


4.4.1 Game Theoretic Model 


We start with the standard definitions of a game (see also Chapter 1). A game G = 
$(M, (X_i), (s_i))$ has a finite set M of m players. Player i has a set $X_i$ of N actions and 
a loss function $s_i : X_i \times (\times_{j \ne i} X_j) \to [0, 1]$ that maps the action of player i and the 
actions of the other players to a real number. (We have scaled losses to [0, 1].) The 
joint action space is $X = \times_i X_i$. 

We consider a player i that plays a game G for T time steps using an online procedure 
ON. At time step t, player i plays a distribution (mixed action) $P_i^t$, while the other players 
play the joint distribution $P_{-i}^t$. We denote by $\ell_{ON}^t$ the loss of player i at time t, i.e., 
$E_{x^t \sim P^t}[s_i(x^t)]$, and its cumulative loss is $L_{ON}^T = \sum_{t=1}^T \ell_{ON}^t$.² It is natural to define, for 
player i at time t, the loss vector as $\ell^t = (\ell_1^t, \ldots, \ell_N^t)$, where $\ell_j^t = E_{x_{-i}^t \sim P_{-i}^t}[s_i(x_j, x_{-i}^t)]$. 
Namely, $\ell_j^t$ is the loss player i would have observed if at time t it had played action 
$x_j$. The cumulative loss of action $x_j \in X_i$ of player i is $L_j^T = \sum_{t=1}^T \ell_j^t$, and $L_{\min}^T = 
\min_j L_j^T$. 


4.4.2 Constant Sum Games and External Regret Minimization 


A two-player constant sum game $G = (\{1, 2\}, (X_i), (s_i))$ has the property that for some 
constant c, for every $x_1 \in X_1$ and $x_2 \in X_2$ we have $s_1(x_1, x_2) + s_2(x_1, x_2) = c$. It is well 
known that any constant sum game has a well-defined value $(v_1, v_2)$ for the game, and 
player $i \in \{1, 2\}$ has a mixed strategy which guarantees that its expected loss is at most 
$v_i$, regardless of the other player’s strategy. (See Owen, 1982, for more details.) In such 
games, external regret-minimization procedures provide the following guarantee. 


Theorem 4.9 Let G be a constant sum game with game value $(v_1, v_2)$. If player 
$i \in \{1, 2\}$ plays for T steps using a procedure ON with external regret R, then its 
average loss $\frac{1}{T} L_{ON}^T$ is at most $v_i + R/T$. 


PROOF Let q be the mixed strategy corresponding to the observed frequencies 
of the actions player 2 has played; that is, $q_j = \sum_{t=1}^T P_{2,j}^t / T$, where $P_{2,j}^t$ is the 
weight player 2 gives to action j at time t. By the theory of constant sum games, 
for any mixed strategy q of player 2, player 1 has some action $x_k \in X_1$ such 
that $E_{x_2 \sim q}[s_1(x_k, x_2)] \le v_1$ (see Owen, 1982). This implies, in our setting, that if 
player 1 has always played action $x_k$, then its loss would be at most $v_1 T$. Therefore 
$L_{\min}^T \le L_k^T \le v_1 T$. Now, using the fact that player 1 is playing a procedure ON 
with external regret R, we have that $L_{ON}^T \le L_{\min}^T + R \le v_1 T + R$. 


Thus, using a procedure with regret $R = O(\sqrt{T \log N})$ as in Theorem 4.6 will 
guarantee average loss at most $v_i + O(\sqrt{(\log N)/T})$. 

In fact, we can use the existence of external regret minimization algorithms 
to prove the minimax theorem of two-player zero-sum games. For 
player 1, let $v_{\min}^1 = \max_{z \in \Delta(X_2)} \min_{x_1 \in X_1} E_{x_2 \sim z}[s_1(x_1, x_2)]$ and 
$v_{\max}^1 = \min_{z \in \Delta(X_1)} \max_{x_2 \in X_2} E_{x_1 \sim z}[s_1(x_1, x_2)]$. That is, $v_{\min}^1$ is the best loss that player 1 can 
guarantee for itself if it is told the mixed action of player 2 in advance. Similarly, $v_{\max}^1$ is the 
best loss that player 1 can guarantee to itself if it has to go first in selecting a mixed 
action, and player 2’s action may then depend on it. The minimax theorem states that 
$v_{\min}^1 = v_{\max}^1$. Since $s_1(x_1, x_2) = -s_2(x_1, x_2)$ we can similarly define $v_{\min}^2 = -v_{\max}^1$ and 
$v_{\max}^2 = -v_{\min}^1$. 

In the following we give a proof of the minimax theorem based on the existence 
of external regret algorithms. Assume for contradiction that $v_{\max}^1 = v_{\min}^1 + \gamma$ for some 
$\gamma > 0$ (it is easy to see that $v_{\max}^1 \ge v_{\min}^1$). Consider both players playing a regret 
minimization algorithm for T steps having external regret of at most R, such that 
$R/T < \gamma/2$. Let $L_{ON}$ be the loss of player 1 and note that $-L_{ON}$ is the loss of player 
2. Let $L_{\min}^i$ be the cumulative loss of the best action of player $i \in \{1, 2\}$. As before, 
let $q_i$ be the mixed strategy corresponding to the observed frequencies of actions of 
player $i \in \{1, 2\}$. Then, $L_{\min}^1 / T \le v_{\min}^1$, since for $L_{\min}^1$ we select the best action with 
respect to a specific mixed action, namely $q_2$. Similarly, $L_{\min}^2 / T \le v_{\min}^2$. The regret 
minimization algorithms guarantee for player 1 that $L_{ON} \le L_{\min}^1 + R$, and for player 
2 that $-L_{ON} \le L_{\min}^2 + R$. Combining the inequalities we have: 

$$T v_{\max}^1 - R = -T v_{\min}^2 - R \le -L_{\min}^2 - R \le L_{ON} \le L_{\min}^1 + R \le T v_{\min}^1 + R.$$

This implies that $v_{\max}^1 - v_{\min}^1 \le 2R/T < \gamma$, which is a contradiction. Therefore, 
$v_{\max}^1 = v_{\min}^1$, which establishes the minimax theorem. 


² Alternatively, we could consider $x_i^t$ as a random variable distributed according to $P_i^t$, and similarly discuss the 
expected loss. We prefer the above presentation for consistency with the rest of the chapter. 
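
The guarantee of Theorem 4.9 can be observed in a small simulation. In the sketch below both players run the Polynomial Weights update on a constant sum game with c = 1; the function name, the matrix representation `A[i][j]` of player 1's loss, and the choice of step sizes are assumptions made for illustration, not part of the chapter.

```python
import math


def pw_self_play(A, T):
    """Both players run Polynomial Weights; return player 1's average loss.

    A[i][j] in [0, 1] is player 1's loss when player 1 plays row i and player 2
    plays column j; player 2's loss is 1 - A[i][j] (constant sum c = 1).
    """
    n, m = len(A), len(A[0])
    eta1 = min(math.sqrt(math.log(n) / T), 0.5)
    eta2 = min(math.sqrt(math.log(m) / T), 0.5)
    w1, w2 = [1.0] * n, [1.0] * m
    total1 = 0.0
    for _ in range(T):
        s1, s2 = sum(w1), sum(w2)
        p = [w / s1 for w in w1]          # mixed action of player 1
        q = [w / s2 for w in w2]          # mixed action of player 2
        # Loss vectors: expected loss of each pure action against the opponent.
        l1 = [sum(q[j] * A[i][j] for j in range(m)) for i in range(n)]
        l2 = [sum(p[i] * (1.0 - A[i][j]) for i in range(n)) for j in range(m)]
        total1 += sum(p[i] * l1[i] for i in range(n))
        w1 = [w * (1.0 - eta1 * l) for w, l in zip(w1, l1)]
        w2 = [w * (1.0 - eta2 * l) for w, l in zip(w2, l2)]
    return total1 / T
```

For the 2 × 2 game with loss matrix [[0, 1], [0.75, 0.25]], whose value for player 1 works out to 1/2, the average loss returned should lie within $2\sqrt{(\ln 2)/T}$ of 1/2, since each player's regret bound from Theorem 4.6 applies.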


4.4.3 Correlated Equilibrium and Swap Regret Minimization 


We first define the relevant modification rules and establish the connection between 
them and equilibrium notions. For $x_i, b_1, b_2 \in X_i$, let $\mathrm{switch}_i(x_i, b_1, b_2)$ be the following 
modification function of the action $x_i$ of player i: 

$$\mathrm{switch}_i(x_i, b_1, b_2) = \begin{cases} b_2 & \text{if } x_i = b_1, \\ x_i & \text{otherwise.} \end{cases}$$

Given a modification function f for player i, we can measure the regret of player i 
with respect to f as the decrease in its loss, i.e., 

$$\mathrm{regret}_i(x, f) = s_i(x) - s_i(f(x_i), x_{-i}).$$

For example, when we consider $f(x_i) = \mathrm{switch}_i(x_i, b_1, b_2)$, for fixed $b_1, b_2 \in X_i$, 
then $\mathrm{regret}_i(x, f)$ is measuring the regret player i has for playing action $b_1$ rather than 
$b_2$, when the other players play $x_{-i}$. 
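The two definitions above translate directly into code. The sketch below is illustrative only: the function names `switch` and `regret`, the representation of a joint action as a tuple, and the loss table `LOSS0` are assumptions, not notation from the chapter.

```python
def switch(x_i, b1, b2):
    """switch_i(x_i, b1, b2): play b2 whenever the original action is b1."""
    return b2 if x_i == b1 else x_i

def regret(loss, x, i, f):
    """regret_i(x, f) = s_i(x) - s_i(f(x_i), x_{-i}) for a joint action x (a tuple)."""
    deviated = list(x)
    deviated[i] = f(x[i])
    return loss(x) - loss(tuple(deviated))

# Hypothetical loss table for player 0 in a 2x2 game: LOSS0[a][b].
LOSS0 = [[1.0, 3.0], [0.0, 2.0]]

def loss0(x):
    return LOSS0[x[0]][x[1]]

# Player 0 played action 0 against action 1; switching 0 -> 1 lowers the loss from 3 to 2.
r = regret(loss0, (0, 1), 0, lambda a: switch(a, 0, 1))
# Player 0 played action 1, which the rule leaves untouched, so the regret is zero.
r_unchanged = regret(loss0, (1, 1), 0, lambda a: switch(a, 0, 1))
```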

A correlated equilibrium is a distribution P over the joint action space with the 
following property. Imagine a correlating device draws a vector of actions $x \in X$ using 
distribution P over X, and gives player i the action $x_i$ from x. (Player i is not given 
any other information regarding x.) The probability distribution P is a correlated 
equilibrium if, for each player, it is a best response to play the suggested action, 
provided that the other players also do not deviate. (For a more detailed discussion of 
correlated equilibrium, see Chapter 1.) 


Definition 4.10 A joint probability distribution P over X is a correlated equilibrium 
if for every player i, and any actions $b_1, b_2 \in X_i$, we have that 

$$E_{x \sim P}[\mathrm{regret}_i(x, \mathrm{switch}_i(\cdot, b_1, b_2))] \le 0.$$


An equivalent definition that extends more naturally to the case of approximate 
equilibria is to say that rather than only switching between a pair of actions, we allow 
simultaneously replacing every action in $X_i$ with another action in $X_i$ (possibly the same 
action). A distribution P is a correlated equilibrium iff for any function $F : X_i \to X_i$ 
we have $E_{x \sim P}[\mathrm{regret}_i(x, F)] \le 0$. 

We now define an $\epsilon$-correlated equilibrium. An $\epsilon$-correlated equilibrium is a distribution 
P such that each player has in expectation at most an $\epsilon$ incentive to deviate. 
Formally, 


Definition 4.11 A joint probability distribution P over X is an $\epsilon$-correlated 
equilibrium if for every player i and for any function $F_i : X_i \to X_i$, we have 
$E_{x \sim P}[\mathrm{regret}_i(x, F_i)] \le \epsilon$. 
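To make Definition 4.11 concrete, the following sketch checks the condition for a two-player game given as loss tables. It relies on the observation that the maximum over all functions F decomposes: each suggested action can be given its own best replacement independently. The names `max_swap_gain` and `is_eps_correlated`, the dictionary representation of P, and the intersection game itself are assumptions made for illustration.

```python
def max_swap_gain(P, loss, player, n_actions):
    """Largest E_{x~P}[regret_player(x, F)] over all F, for a two-player game.

    P maps joint actions (a, b) to probabilities; loss[a][b] is this player's loss.
    """
    total = 0.0
    for suggested in range(n_actions):
        best = 0.0  # F(suggested) = suggested contributes zero
        for alt in range(n_actions):
            gain = 0.0
            for (a, b), prob in P.items():
                if (a, b)[player] != suggested:
                    continue
                dev = (alt, b) if player == 0 else (a, alt)
                gain += prob * (loss[a][b] - loss[dev[0]][dev[1]])
            best = max(best, gain)
        total += best
    return total

def is_eps_correlated(P, losses, n_actions, eps):
    """True if no player can gain more than eps in expectation by any swap F."""
    return all(max_swap_gain(P, losses[i], i, n_actions) <= eps + 1e-12
               for i in (0, 1))

# Hypothetical intersection game: action 0 = stop, action 1 = go.
ROW = [[0.6, 0.5], [0.0, 1.0]]   # row player's loss, indexed [row action][column action]
COL = [[0.6, 0.0], [0.5, 1.0]]   # column player's loss, same indexing
traffic_light = {(0, 1): 0.5, (1, 0): 0.5}
uniform = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
```

In this example the "traffic light" distribution is an exact correlated equilibrium, while under the uniform distribution a player told to stop gains 0.025 in expectation by going instead, so the uniform distribution is only an $\epsilon$-correlated equilibrium for $\epsilon \ge 0.025$.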


The following theorem relates the empirical distribution of the actions performed 
by each player, their swap regret, and the distance to correlated equilibrium. 


Theorem 4.12 Let $G = (M, (X_i), (s_i))$ be a game and assume that for T time 
steps every player follows a strategy that has swap regret of at most R. Then, 
the empirical distribution Q of the joint actions played by the players is an 
(R/T)-correlated equilibrium. 


PROOF The empirical distribution Q assigns to every $P^t$ a probability of 1/T. 
Fix a function $F : X_i \to X_i$ for player i. Since player i has swap regret at most 
R, we have $L^T_{ON} \le L^T_{ON,F} + R$, where $L^T_{ON}$ is the loss of player i. By definition of 
the regret function, we therefore have 

$$L^T_{ON} - L^T_{ON,F} = \sum_{t=1}^{T} E_{x^t \sim P^t}[s_i(x^t)] - \sum_{t=1}^{T} E_{x^t \sim P^t}[s_i(F(x^t_i), x^t_{-i})] = \sum_{t=1}^{T} E_{x^t \sim P^t}[\mathrm{regret}_i(x^t, F)] = T \cdot E_{x \sim Q}[\mathrm{regret}_i(x, F)].$$

Therefore, for any function $F_i : X_i \to X_i$ we have $E_{x \sim Q}[\mathrm{regret}_i(x, F_i)] \le R/T$. 


The above theorem states that the payoff of each player is its payoff in some 
approximate correlated equilibrium. In addition, it relates the swap regret to the distance 
from equilibrium. Note that if the average swap regret vanishes then the procedure 
converges, in the limit, to the set of correlated equilibria. 


4.4.4 Dominated Strategies 


We say that an action $x_j \in X_i$ is $\epsilon$-dominated by action $x_k \in X_i$ if for any $x_{-i} \in X_{-i}$ we 
have $s_i(x_j, x_{-i}) \ge \epsilon + s_i(x_k, x_{-i})$. Similarly, action $x_j \in X_i$ is $\epsilon$-dominated by a mixed 
action $y \in \Delta(X_i)$ if for any $x_{-i} \in X_{-i}$ we have $s_i(x_j, x_{-i}) \ge \epsilon + E_{x_d \sim y}[s_i(x_d, x_{-i})]$. 

Intuitively, a good learning algorithm ought to be able to learn not to play actions 
that are $\epsilon$-dominated by others, and in this section we show that indeed if player i plays 
a procedure with sublinear swap regret, then it will very rarely play dominated actions. 
More precisely, let action $x_j$ be $\epsilon$-dominated by action $x_k \in X_i$. Using our notation, 
this implies that for any $x_{-i}$ we have that $\mathrm{regret}_i(x, \mathrm{switch}_i(\cdot, x_j, x_k)) \ge \epsilon$ whenever $x_i = x_j$. Let $D_\epsilon$ be 
the set of $\epsilon$-dominated actions of player i, and let w be the weight that player i puts on 
actions in $D_\epsilon$, averaged over time, i.e., $w = \frac{1}{T} \sum_{t=1}^{T} \sum_{j \in D_\epsilon} P^t_{i,j}$ (here $P^t_{i,j}$ is the probability that player i places on action $x_j$ at time t). Player i’s swap regret 
is at least $\epsilon w T$ (since we could replace each action in $D_\epsilon$ with the action that dominates 
it). So, if the player’s swap regret is R, then $\epsilon w T \le R$. Therefore, the time-average 
weight that player i puts on the set of $\epsilon$-dominated actions is at most $R/(\epsilon T)$, which 
tends to 0 if R is sublinear in T. That is: 


Theorem 4.13 Consider a game G and a player i that uses a procedure of swap 
regret R for T time steps. Then the average weight that player i puts on the set of 
$\epsilon$-dominated actions is at most $R/(\epsilon T)$. 


We remark that in general the property of having low external regret is not sufficient 
by itself to give such a guarantee, though the algorithms RWM and PW do indeed have 
such a guarantee (see Exercise 4.8). 
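The quantities in Theorem 4.13 are easy to compute for a small game. The sketch below is an illustration under stated assumptions: it handles only domination by pure actions, it represents the player's loss as a table `loss[j][b]` indexed by the player's action j and the opponent's action b, and the names `eps_dominated`, `average_weight`, and `weight_bound` are hypothetical.

```python
def eps_dominated(loss, eps):
    """Actions j for which some other action k satisfies
    loss[j][b] >= eps + loss[k][b] for every opponent action b."""
    n, m = len(loss), len(loss[0])
    return {
        j for j in range(n)
        if any(k != j and all(loss[j][b] >= eps + loss[k][b] for b in range(m))
               for k in range(n))
    }

def average_weight(dists, actions):
    """Time-average probability mass w that the played distributions put on `actions`."""
    return sum(p[j] for p in dists for j in actions) / len(dists)

def weight_bound(R, eps, T):
    """The bound R / (eps * T) from Theorem 4.13."""
    return R / (eps * T)

# Hypothetical loss table: action 1 is 0.25-dominated by action 0.
LOSS = [[0.2, 0.4], [0.5, 0.9], [0.3, 0.1]]
D = eps_dominated(LOSS, 0.25)
# Two rounds of play; the player put mass 0.3 and then 0.1 on action 1.
w = average_weight([[0.5, 0.3, 0.2], [0.6, 0.1, 0.3]], D)
```

A procedure with swap regret R = 10 over T = 1000 steps would, by the theorem, have to keep w below `weight_bound(10, 0.25, 1000)`, that is, below 0.04.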


4.5 Generic Reduction from External to Swap Regret 


In this section we give a black-box reduction showing how any procedure A achieving 
good external regret can be used as a subroutine to achieve good swap regret as well. 
The high-level idea is as follows (see also Figure 4.1). We will instantiate N copies 
$A_1, \ldots, A_N$ of the external-regret procedure. At each time step, these procedures will 
each give us a probability vector, which we will combine in a particular way to produce 
our own probability vector p. When we receive a loss vector $\ell$, we will partition it 
among the N procedures, giving procedure $A_i$ a fraction $p_i$ ($p_i$ is our probability mass 
on action i), so that $A_i$’s belief about the loss of action j is $\sum_t p^t_i \ell^t_j$, and matches the 
cost we would incur putting i’s probability mass on j. In the proof, procedure $A_i$ will, 
in some sense, be responsible for ensuring low regret of the $i \to j$ variety. The key to 
making this work is that we will be able to define the p’s so that the sum of the losses 
of the procedures $A_i$ on their own loss vectors matches our overall true loss. Recall the 
definition of an R external regret procedure. 


Figure 4.1. The structure of the swap regret reduction. 

Definition 4.14 An R external regret procedure A guarantees that for any sequence 
of T losses $\ell^t$ and for any action $j \in \{1, \ldots, N\}$, we have 

$$L^T_A = \sum_{t=1}^{T} p^t \cdot \ell^t \le \sum_{t=1}^{T} \ell^t_j + R.$$


We assume we have N copies $A_1, \ldots, A_N$ of an R external regret procedure. We 
combine the N procedures to one master procedure H as follows. At each time step t, 
each procedure $A_i$ outputs a distribution $q^t_i$, where $q^t_{i,j}$ is the fraction it assigns action 
j. We compute a single distribution $p^t$ such that $p^t_j = \sum_i p^t_i q^t_{i,j}$. That is, $p^t = p^t Q^t$, 
where $p^t$ is our distribution and $Q^t$ is the matrix of $q^t_{i,j}$. (We can view $p^t$ as a stationary 
distribution of the Markov Process defined by $Q^t$, and it is well known that such a 
$p^t$ exists and is efficiently computable.) For intuition into this choice of $p^t$, notice 
that it implies we can consider action selection in two equivalent ways. The first is 
simply using the distribution $p^t$ to select action j with probability $p^t_j$. The second is to 
select procedure $A_i$ with probability $p^t_i$ and then to use $A_i$ to select the action (which 
produces distribution $p^t Q^t$). 

When the adversary returns the loss vector $\ell^t$, we return to each $A_i$ the loss vector 
$p^t_i \ell^t$. So, procedure $A_i$ experiences loss $(p^t_i \ell^t) \cdot q^t_i = p^t_i (q^t_i \cdot \ell^t)$. 

Since $A_i$ is an R external regret procedure, for any action j, we have 

$$\sum_{t=1}^{T} p^t_i (q^t_i \cdot \ell^t) \le \sum_{t=1}^{T} p^t_i \ell^t_j + R. \qquad (4.1)$$

If we sum the losses of the N procedures at a given time t, we get $\sum_i p^t_i (q^t_i \cdot \ell^t) = p^t Q^t \ell^t$, 
where $p^t$ is the row vector of our distribution, $Q^t$ is the matrix of $q^t_{i,j}$, and $\ell^t$ 
is viewed as a column vector. By design of $p^t$, we have $p^t Q^t = p^t$. So, the sum of the 
perceived losses of the N procedures is equal to our actual loss $p^t \ell^t$. 

Therefore, summing equation (4.1) over all N procedures, the left-hand side sums 
to $L^T_H$, where H is our master online procedure. Since the right-hand side of equation 
(4.1) holds for any j, we have that for any function $F : \{1, \ldots, N\} \to \{1, \ldots, N\}$, 

$$L^T_H \le \sum_{i=1}^{N} \sum_{t=1}^{T} p^t_i \ell^t_{F(i)} + NR = L^T_{H,F} + NR.$$

Therefore we have proven the following theorem. 

Theorem 4.15 Given an R external regret procedure, the master online procedure 
H has the following guarantee. For every function $F : \{1, \ldots, N\} \to \{1, \ldots, N\}$, 

$$L^T_H \le L^T_{H,F} + NR,$$

i.e., the swap regret of H is at most NR. 
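The construction of the master procedure H can be sketched in a few lines. This is an illustration rather than the authors' implementation: an exponential-weights class `Hedge` stands in for the external regret procedures $A_i$, the stationary distribution is approximated by simple iteration of the lazy chain $(I + Q)/2$ rather than solved exactly, and all names (`Hedge`, `stationary`, `master_step`) are assumptions.

```python
import math

class Hedge:
    """Exponential-weights procedure, used here as a stand-in for each A_i."""
    def __init__(self, n, eta):
        self.w = [1.0 / n] * n
        self.eta = eta

    def distribution(self):
        return list(self.w)

    def update(self, loss):
        w = [x * math.exp(-self.eta * l) for x, l in zip(self.w, loss)]
        s = sum(w)
        self.w = [x / s for x in w]   # renormalize to avoid underflow

def stationary(Q, iters=2000):
    """Approximate a row vector p with p = pQ by iterating the lazy chain (I + Q) / 2."""
    n = len(Q)
    p = [1.0 / n] * n
    for _ in range(iters):
        pQ = [sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]
        p = [(a + b) / 2 for a, b in zip(p, pQ)]
    return p

def master_step(procs, loss):
    """One round of H: combine the q_i into p, then charge A_i the vector p_i * loss.

    In an online setting p is committed before `loss` is revealed; the two phases
    are merged here only for brevity.
    """
    Q = [a.distribution() for a in procs]
    p = stationary(Q)
    for i, a in enumerate(procs):
        a.update([p[i] * l for l in loss])
    return p, Q
```

The fixed point of the lazy iteration satisfies $p = (p + pQ)/2$, which is the same condition as $p = pQ$; a linear solver could be substituted when an exact stationary distribution is needed.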


Using Theorem 4.6, we can immediately derive the following corollary. 

Corollary 4.16 There exists an online algorithm H such that for every function 
$F : \{1, \ldots, N\} \to \{1, \ldots, N\}$, we have that 

$$L^T_H \le L^T_{H,F} + O(N\sqrt{T \log N}),$$

i.e., the swap regret of H is at most $O(N\sqrt{T \log N})$. 


Remark. See Exercise 4.6 for an improvement to $O(\sqrt{NT \log N})$. 


4.6 The Partial Information Model 


In this section we show, for external regret, a simple reduction from the partial information 
to the full information model.³ The main difference between the two models is 
that in the full information model, the online procedure has access to the loss of every 
action. In the partial information model the online procedure receives as feedback only 
the loss of a single action, the action it performed. This very naturally leads to an ex- 
ploration versus exploitation trade-off in the partial information model, and essentially 
any online procedure will have to somehow explore the various actions and estimate 
their loss. 

The high-level idea of the reduction is as follows. Assume that the number of time 
steps T is given as a parameter. We will partition the T time steps into K blocks. The 
procedure will use the same distribution over actions in all the time steps of any given 
block, except it will also randomly sample each action once (the exploration part). 
The partial information procedure MAB will pass to the full information procedure FIB 
the vector of losses received from its exploration steps. The full information procedure 
FIB will then return a new distribution over actions. The main part of the proof will be 
to relate the loss of the full information procedure FIB on the loss sequence it observes 
to the loss of the partial information procedure MAB on the real loss sequence. 

We start by considering a full information procedure FIB that partitions the T time 
steps into K blocks, $B^1, \ldots, B^K$, where $B^i = \{(i-1)(T/K)+1, \ldots, i(T/K)\}$, and 
uses the same distribution in all the time steps of a block. (For simplicity we assume 
that K divides T.) Consider an $R_K$ external regret minimization procedure FIB (over 
K time steps), which at the end of block $\tau$ updates the distribution using the average 
loss vector, i.e., $c^\tau = \sum_{t \in B^\tau} \ell^t / |B^\tau|$. Let $C^K_i = \sum_{\tau=1}^{K} c^\tau_i$ and $C^K_{min} = \min_i C^K_i$. Since 
FIB has external regret at most $R_K$, this implies that the loss of FIB, over the loss 
sequence $c^\tau$, is at most $C^K_{min} + R_K$. Since in every block $B^\tau$ the procedure FIB uses a 
single distribution $p^\tau$, its loss on the entire loss sequence is: 

$$L^T_{FIB} = \sum_{\tau=1}^{K} \sum_{t \in B^\tau} p^\tau \cdot \ell^t = \frac{T}{K} \sum_{\tau=1}^{K} p^\tau \cdot c^\tau \le \frac{T}{K} \left[ C^K_{min} + R_K \right].$$

At this point it is worth noting that if $R_K = O(\sqrt{K \log N})$ the overall regret is 
$O((T/\sqrt{K})\sqrt{\log N})$, which is minimized at K = T, namely by having each block 
be a single time step. However, we will have an additional loss associated with each 
block (due to the sampling) which will cause the optimization to require that $K \ll T$. 

3 This reduction does not produce the best-known bounds for the partial information model (see, e.g., Auer et al., 
2002 for better bounds) but is particularly simple and generic. 

The next step in developing the partial information procedure MAB is to use loss 
vectors that are not the “true average” but whose expectation is the same. More formally, 
the feedback to the full information procedure FIB will be a random variable vector 
$\hat{c}^\tau$ such that for any action i we have $E[\hat{c}^\tau_i] = c^\tau_i$. Similarly, let $\hat{C}^K_i = \sum_{\tau=1}^{K} \hat{c}^\tau_i$ and 
$\hat{C}^K_{min} = \min_i \hat{C}^K_i$. (Intuitively, we will generate the vector $\hat{c}^\tau$ using sampling within a 
block.) This implies that for any block $B^\tau$ and any distribution $p^\tau$ we have 

$$\frac{1}{|B^\tau|} \sum_{t \in B^\tau} p^\tau \cdot \ell^t = p^\tau \cdot c^\tau = \sum_{i=1}^{N} p^\tau_i c^\tau_i = \sum_{i=1}^{N} p^\tau_i E[\hat{c}^\tau_i]. \qquad (4.2)$$

That is, the average loss of $p^\tau$ in $B^\tau$ is equal to its expected loss with respect to $\hat{c}^\tau$. 

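The full information procedure FIB observes the losses $\hat{c}^\tau$, for $\tau \in \{1, \ldots, K\}$. 
However, since the $\hat{c}^\tau$ are random variables, the distribution $p^\tau$ is also a random variable 
that depends on the previous losses, i.e., $\hat{c}^1, \ldots, \hat{c}^{\tau-1}$. Still, with respect to any sequence 
of losses $\hat{c}^\tau$, we have that 

$$\hat{C}^K_{FIB} = \sum_{\tau=1}^{K} p^\tau \cdot \hat{c}^\tau \le \hat{C}^K_{min} + R_K.$$

Since $E[\hat{C}^K_i] = C^K_i$, this implies that 

$$E[\hat{C}^K_{FIB}] \le E[\hat{C}^K_{min}] + R_K \le C^K_{min} + R_K,$$

where we used the fact that $E[\min_i \hat{C}^K_i] \le \min_i E[\hat{C}^K_i]$ and the expectation is over the 
choices of $\hat{c}^\tau$. 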

Note that for any sequence of losses $\hat{c}^1, \ldots, \hat{c}^K$, both FIB and MAB will use the 
same sequence of distributions $p^1, \ldots, p^K$. From (4.2) we have that in any block $B^\tau$ 
the expected loss of FIB and the loss of MAB are the same, assuming they both use the 
same distribution $p^\tau$. This implies that 

$$E[C^K_{MAB}] = E[\hat{C}^K_{FIB}].$$


We now need to show how to derive random variables $\hat{c}^\tau$ with the desired property. 
This will be done by choosing randomly, for each action i and block $B^\tau$, an exploration 
time $t_i \in B^\tau$. (These do not need to be independent over the different actions, so can 
easily be done without collisions.) At time $t_i$ the procedure MAB will play action i (i.e., 
the probability vector with all probability mass on i). This implies that the feedback that 
it receives will be $\ell^{t_i}_i$, and we will then set $\hat{c}^\tau_i$ to be $\ell^{t_i}_i$. This guarantees that $E[\hat{c}^\tau_i] = c^\tau_i$. 

So far we have ignored the loss in the exploration steps. Since the maximum loss is 
1, and there are N exploration steps in each of the K blocks, the total loss in all the 
exploration steps is at most NK. Therefore we have 

$$E[L^T_{MAB}] \le NK + (T/K) E[C^K_{MAB}] \le NK + (T/K)\left[ C^K_{min} + R_K \right] = L^T_{min} + NK + (T/K) R_K.$$

By Theorem 4.6, there are external regret procedures that have regret $R_K = O(\sqrt{K \log N})$. 
By setting $K = (T/N)^{2/3}$, for $T \ge N$, both NK and $(T/K) R_K$ are 
$O(T^{2/3} N^{1/3} \sqrt{\log N})$, and we have the following theorem. 


Theorem 4.17 Given an $O(\sqrt{K \log N})$ external regret procedure FIB (for K 
time steps), there is a partial information procedure MAB that guarantees 

$$L^T_{MAB} \le L^T_{min} + O(T^{2/3} N^{1/3} \log N),$$

where $T \ge N$. 
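The block structure of the reduction can be sketched as follows. This is a minimal illustration under several assumptions: exponential weights plays the role of FIB, K divides T, each block has at least N steps, and the name `mab_reduction` and its parameters are hypothetical. The procedure reads only the loss of the action it actually plays, which is what the partial information model requires.

```python
import math
import random

def mab_reduction(losses, K, eta, rng):
    """Block-based partial information procedure; returns the total loss incurred.

    losses[t][i] in [0, 1] is the loss of action i at time t, but only the entry
    for the action played at time t is ever read.
    """
    T, N = len(losses), len(losses[0])
    block = T // K                    # assumes K divides T and block >= N
    w = [1.0] * N
    total = 0.0
    for tau in range(K):
        s = sum(w)
        p = [x / s for x in w]        # one distribution for the whole block
        start = tau * block
        # One exploration time per action, uniform in the block, without collisions.
        times = rng.sample(range(start, start + block), N)
        explore = dict(zip(times, range(N)))
        c_hat = [0.0] * N
        for t in range(start, start + block):
            if t in explore:
                i = explore[t]
                c_hat[i] = losses[t][i]      # unbiased estimate of the block average
            else:
                i = rng.choices(range(N), weights=p)[0]
            total += losses[t][i]
        # Full information update on the estimated loss vector.
        w = [x * math.exp(-eta * c) for x, c in zip(w, c_hat)]
        m = max(w)
        w = [x / m for x in w]
    return total
```

For example, with T = 3000 and N = 3 one might take $K = (T/N)^{2/3} = 100$ and call `mab_reduction(losses, 100, math.sqrt(math.log(3) / 100), random.Random(0))`; the step size here is one reasonable choice, not one prescribed by the text.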


4.7 On Convergence of Regret-Minimizing Strategies to Nash 
Equilibrium in Routing Games 


As mentioned earlier, one natural setting for regret-minimizing algorithms is online 
routing. For example, a person could use such algorithms to select which of N available 
routes to use to drive to work each morning in such a way that his performance will 
be nearly as good as the best fixed route in hindsight, even if traffic changes arbitrarily 
from day to day. In fact, even though in a graph G, the number of paths N between 
two nodes may be exponential in the size of G, there are a number of external-regret 
minimizing algorithms whose running time and regret bounds are polynomial in the 
graph size. Moreover, a number of extensions have shown how these algorithms can be 
applied even to the partial-information setting where only the cost of the path traversed 
is revealed to the algorithm. 

In this section we consider the game-theoretic properties of such algorithms in the 
Wardrop model of traffic flow. In this model, we have a directed network G = (V, E), 
and one unit flow of traffic (a large population of infinitesimal users that we view as 
having one unit of volume) wanting to travel between two distinguished nodes $v_{start}$ 
and $v_{end}$. (For simplicity, we are considering just the single-commodity version of the 
model.) We assume each edge e has a cost given by a latency function $\ell_e$ that is some 
nondecreasing function of the amount of traffic flowing on edge e. In other words, the 
time to traverse each edge e is a function of the amount of congestion on that edge. In 
particular, given some flow f, where we use $f_e$ to denote the amount of flow on a given 
edge e, the cost of some path P is $\sum_{e \in P} \ell_e(f_e)$ and the average travel time of all users 
in the population can be written as $\sum_{e \in E} \ell_e(f_e) f_e$. A flow f is at Nash equilibrium if 
all flow-carrying paths P from $v_{start}$ to $v_{end}$ are minimum-latency paths given the flow 
f. 
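The definitions of path cost, average travel time, and Nash flow can be checked on a small instance. The sketch below is illustrative: paths are lists of edge names, latency functions are Python callables, and the two-link network (one edge with latency x, one with constant latency 1) together with all function names is an assumption made for the example.

```python
def edge_flows(path_flows, paths):
    """Aggregate flow per edge from the flow routed on each path."""
    f = {}
    for path, amount in zip(paths, path_flows):
        for e in path:
            f[e] = f.get(e, 0.0) + amount
    return f

def path_cost(path, f, latency):
    """Cost of a path: sum over its edges of l_e(f_e)."""
    return sum(latency[e](f.get(e, 0.0)) for e in path)

def average_cost(f, latency):
    """Average travel time: sum over edges of l_e(f_e) * f_e."""
    return sum(latency[e](x) * x for e, x in f.items())

def is_nash(path_flows, paths, latency, tol=1e-9):
    """True if every flow-carrying path is a minimum-latency path."""
    f = edge_flows(path_flows, paths)
    costs = [path_cost(p, f, latency) for p in paths]
    best = min(costs)
    return all(c <= best + tol for c, x in zip(costs, path_flows) if x > 0)

# Hypothetical two-link network from v_start to v_end.
LATENCY = {"top": lambda x: x, "bottom": lambda x: 1.0}
PATHS = [["top"], ["bottom"]]
```

Routing all traffic on the top link is a Nash flow here, since both paths then cost 1. Splitting the traffic evenly is not, because users on the bottom link pay 1 while the top link costs only 0.5, even though the even split has the lower average cost of 0.75.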

Chapter 18 considers this model in much more detail, analyzing the relationship 
between latencies in Nash equilibrium flows and those in globally optimum flows 
(flows that minimize the total travel time averaged over all users). In this section we 
describe results showing that if the users in such a setting are adapting their paths 
from day to day using external-regret minimizing algorithms (or even if they just 
happen to experience low-regret, regardless of the specific algorithms used) then flow 
will approach Nash equilibrium. Note that a Nash equilibrium is precisely a set of 
static strategies that are all no-regret with respect to each other, so such a result seems 
natural; however, there are many simple games for which regret-minimizing algorithms 
do not approach Nash equilibrium and can even perform much worse than any Nash 
equilibrium. 

Specifically, one can show that if each user has regret o(T), or even if just the average 
regret (averaged over the users) is o(T), then flow approaches Nash equilibrium in the 
sense that a $1 - \epsilon$ fraction of days t have the property that a $1 - \epsilon$ fraction of the 
users that day experience travel time at most $\epsilon$ larger than the best path for that day, 
where $\epsilon$ approaches 0 at a rate that depends polynomially on the size of the graph, 
the regret-bounds of the algorithms, and the maximum slope of any latency function. 
Note that this is a somewhat nonstandard notion of convergence to equilibrium: usually 
for an “$\epsilon$-approximate equilibrium” one requires that all participants have at most $\epsilon$ 
incentive to deviate. However, since low-regret algorithms are allowed to occasionally 
take long paths, and in fact algorithms in the MAB model must occasionally explore 
paths they have not tried in a long time (to avoid regret if the paths have become much 
better in the meantime), the multiple levels of hedging are actually necessary for a 
result of this kind. 

In this section we present just a special case of this result. Let $\mathcal{P}$ denote the set of 
all simple paths from $v_{start}$ to $v_{end}$ and let $f^t$ denote the flow on day t. Let 
$C(f) = \sum_{e \in E} \ell_e(f_e) f_e$ denote the cost of a flow f. Note that C(f) is a weighted average of 
costs of paths in $\mathcal{P}$ and in fact is equal to the average cost of all users in the flow f. 
Define a flow f to be $\epsilon$-Nash if $C(f) \le \epsilon + \min_{P \in \mathcal{P}} \sum_{e \in P} \ell_e(f_e)$; that is, the average 
incentive to deviate over all users is at most $\epsilon$. Let R(T) denote the average regret 
(averaged over users) up through day T, so 

$$R(T) = \sum_{t=1}^{T} \sum_{e \in E} \ell_e(f^t_e) f^t_e - \min_{P \in \mathcal{P}} \sum_{t=1}^{T} \sum_{e \in P} \ell_e(f^t_e).$$

Finally, let $T_\epsilon$ denote the number of time steps T needed so that $R(T) \le \epsilon T$ for all 
$T \ge T_\epsilon$. For example the RWM and PW algorithms discussed in Section 4.3 achieve 
$T_\epsilon = O(\frac{1}{\epsilon^2} \log N)$ if we set $\eta = \epsilon/2$. Then we will show the following. 


Theorem 4.18 Suppose the latency functions $\ell_e$ are linear. Then for $T \ge T_\epsilon$, 
the average flow $\hat{f} = \frac{1}{T}(f^1 + \cdots + f^T)$ is $\epsilon$-Nash. 

PROOF From the linearity of the latency functions, we have for all e, 
$\ell_e(\hat{f}_e) = \frac{1}{T} \sum_{t=1}^{T} \ell_e(f^t_e)$. Since $\ell_e(f_e) f_e$ is a convex function of the flow, this implies 

$$\ell_e(\hat{f}_e) \hat{f}_e \le \frac{1}{T} \sum_{t=1}^{T} \ell_e(f^t_e) f^t_e.$$

Summing over all e, we have 

$$\begin{aligned} C(\hat{f}) &\le \frac{1}{T} \sum_{t=1}^{T} C(f^t) \\ &\le \epsilon + \min_{P \in \mathcal{P}} \frac{1}{T} \sum_{t=1}^{T} \sum_{e \in P} \ell_e(f^t_e) && \text{(by definition of } T_\epsilon\text{)} \\ &= \epsilon + \min_{P \in \mathcal{P}} \sum_{e \in P} \ell_e(\hat{f}_e). && \text{(by linearity)} \end{aligned}$$

This result shows the time-average flow is an approximate Nash equilibrium. This can 
then be used to prove that most of the $f^t$ must in fact be approximate Nash. The key idea 
here is that if the cost of any edge were to fluctuate wildly over time, then that would 
imply that most of the users of that edge experienced latency substantially greater than 
the edge’s average cost (because more users are using the edge when it is congested 
than when it is not congested), which in turn implies they experience substantial regret. 
These arguments can then be carried over to the case of general (nonlinear) latency 
functions. 
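Theorem 4.18 can be checked by hand on a tiny example, and the same check is easy to script. The following sketch assumes flows are given directly as per-edge dictionaries and uses a hypothetical two-link network with linear latencies and two days of flow; the function names are illustrative.

```python
def cost(f, latency):
    """C(f): sum over edges of l_e(f_e) * f_e."""
    return sum(latency[e](x) * x for e, x in f.items())

def best_path_cost(f, paths, latency):
    """Minimum over paths of the path cost under flow f."""
    return min(sum(latency[e](f[e]) for e in p) for p in paths)

def is_eps_nash(f, paths, latency, eps):
    """C(f) <= eps + min over paths of the path cost under f."""
    return cost(f, latency) <= eps + best_path_cost(f, paths, latency) + 1e-12

def average_regret(flows, paths, latency):
    """R(T): total cost paid minus the cost of the best fixed path in hindsight."""
    paid = sum(cost(f, latency) for f in flows)
    best = min(sum(latency[e](f[e]) for f in flows for e in p) for p in paths)
    return paid - best

def average_flow(flows):
    """The time-average flow (f^1 + ... + f^T) / T, edge by edge."""
    T = len(flows)
    return {e: sum(f[e] for f in flows) / T for e in flows[0]}

# Hypothetical data: two parallel links with linear latencies, two days of flow.
LATENCY = {"top": lambda x: x, "bottom": lambda x: 1.0}
PATHS = [["top"], ["bottom"]]
FLOWS = [{"top": 1.0, "bottom": 0.0}, {"top": 0.5, "bottom": 0.5}]

R = average_regret(FLOWS, PATHS, LATENCY)
f_hat = average_flow(FLOWS)
```

Here R(T) = 1.75 - 1.5 = 0.25 over T = 2 days, so the theorem's argument applies with $\epsilon = R(T)/T = 0.125$: the average flow has cost 0.8125, while the best path under the average flow costs 0.75.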


4.7.1 Current Research Directions 


In this section we sketch some current research directions with respect to regret mini- 
mization. 


Refined regret bounds: The regret bounds that we presented depend on the number of 
time steps T, and are independent of the performance of the best action. Such bounds 
are also called zero-order bounds. More refined first-order bounds depend on the loss 
of the best action, and second-order bounds depend on the sum of squares of the losses 
(such as $Q^T_k$ in Theorem 4.6). An interesting open problem is to get an external regret 
that is proportional to the empirical variance of the best action. Another challenge is 
to reduce the prior information needed by the regret minimization algorithm. Ideally, 
it should be able to learn and adapt to parameters such as the maximum and minimum 
loss. See Cesa-Bianchi et al. (2005) for a detailed discussion of those issues. 


Large action spaces: In this chapter we assumed the number of actions N is small 
enough to be able to list them all, and our algorithms work in time proportional to N. 
However, in many settings N is exponential in the natural parameters of the problem. 
For example, the N actions might be all simple paths between two nodes s and t in 
an n-node graph, or all binary search trees on $\{1, \ldots, n\}$. Since the full information 
external regret bounds are only logarithmic in N, from the point of view of information, 
we can derive polynomial regret bounds. The challenge is whether in such settings we 
can produce computationally efficient algorithms. 

There have recently been several results able to handle broad classes of problems 
of this type. Kalai and Vempala (2003) give an efficient algorithm for any problem 
in which (a) the set X of actions can be viewed as a subset of $\mathbb{R}^n$, (b) the loss 
vectors $\ell$ are linear functions over $\mathbb{R}^n$ (so the loss of action x is $\ell \cdot x$), and (c) we 
can efficiently solve the offline optimization problem $\arg\min_{x \in X} [x \cdot \ell]$ for any given 
loss vector $\ell$. For instance, this setting can model the path and search-tree examples 
above.⁴ Zinkevich (2003) extends this to convex loss functions with a projection oracle, 
and there is substantial interest in trying to broaden the class of settings that efficient 
regret-minimization algorithms can be applied to. 
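To give a feel for how conditions (a) through (c) are used, here is a minimal sketch in the spirit of the follow-the-perturbed-leader approach of Kalai and Vempala: perturb the cumulative loss vector randomly, then hand it to the offline optimizer. The details are assumptions for illustration, in particular the uniform perturbation on $[0, 1/\epsilon]$ per coordinate, the brute-force optimizer over an explicit list of actions (a real application would call an efficient oracle such as a shortest-path routine), and the names `fpl_choose` and `fpl_run`.

```python
import random

def fpl_choose(cumulative_loss, actions, epsilon, rng):
    """Add a fresh uniform perturbation from [0, 1/epsilon] to each coordinate of the
    cumulative loss, then return the action minimizing the perturbed linear loss.
    The argmin over `actions` stands in for the offline optimization oracle."""
    perturbed = [c + rng.uniform(0.0, 1.0 / epsilon) for c in cumulative_loss]
    return min(actions, key=lambda x: sum(xi * li for xi, li in zip(x, perturbed)))

def fpl_run(loss_vectors, actions, epsilon, rng):
    """Play against a sequence of loss vectors; returns the total loss incurred."""
    cumulative = [0.0] * len(loss_vectors[0])
    total = 0.0
    for loss in loss_vectors:
        x = fpl_choose(cumulative, actions, epsilon, rng)
        total += sum(xi * li for xi, li in zip(x, loss))
        cumulative = [c + l for c, l in zip(cumulative, loss)]
    return total

# Hypothetical instance: edges (s-a, a-t, s-b, b-t); each action is the
# edge-indicator vector of one s-t path.
PATHS = [(1, 1, 0, 0), (0, 0, 1, 1)]
```

The point of the design is that the learner never enumerates the N actions itself; all dependence on the action set goes through the optimizer, so N can be exponential as long as the oracle is efficient.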


4 The case of search trees has the additional issue that there is a rotation cost associated with using a different 
action (tree) at time t + 1 than that used at time t. This is addressed in Kalai and Vempala (2003) as well. 

Dynamics: It is also very interesting to analyze the dynamics of regret minimization 
algorithms. The classical example is that of swap regret: when all the players play 
swap regret-minimization algorithms, the empirical distribution converges to the set 
of correlated equilibria (Section 4.4). We also saw convergence in two-player zero- 
sum games to the minimax value of the game (Section 4.4), and convergence to 
Nash equilibrium in a Wardrop-model routing game (Section 4.7). Further results on 
convergence to equilibria in other settings would be of substantial interest. At a high 
level, understanding the dynamics of regret-minimization algorithms would allow us 
to better understand the strengths and weaknesses of using such procedures. For more 
information on learning in games, see the book by Fudenberg and Levine (1998). 


4.8 Notes 


Hannan (1957) was the first to develop algorithms with external regret sublinear in 
T. Later, motivated by machine learning settings in which N can be quite large, 
algorithms that furthermore have only a logarithmic dependence on N were developed 
by Littlestone and Warmuth (1994), and extended by a number of researchers (Cesa- 
Bianchi et al., 1997; Freund and Schapire, 1997, 1999). In particular, the Randomized 
Weighted Majority algorithm and Theorem 4.5 are from Littlestone and Warmuth 
(1994) and the Polynomial Weights algorithm and Theorem 4.6 are from Cesa-Bianchi 
et al. (2005). Computationally efficient algorithms for generic frameworks that model 
many settings in which N may be exponential in the natural problem description (such 
as considering all s-t paths in a graph or all binary search trees on n elements) were 
developed in Kalai and Vempala (2003) and Zinkevich (2003). 

The notion of internal regret and its connection to correlated equilibrium appear in 
Foster and Vohra (1998) and Hart and Mas-Colell (2000) and more general modification 
rules were considered in Lehrer (2003). A number of specific low internal regret 
algorithms were developed by a number of researchers (Blum and Mansour, 2005; 
Cesa-Bianchi and Lugosi, 2003; Foster and Vohra, 1997, 1998, 1999; Hart and Mas- 
Colell, 2003; Stoltz and Lugosi, 2005). The reduction in Section 4.5 from external to 
swap regret is from Blum and Mansour (2005). 

Algorithms with strong external regret bounds for the partial information model are 
given in Auer et al. (2002), and algorithms with low internal regret appear in Blum and 
Mansour (2005) and Cesa-Bianchi et al. (2006). The reduction from full information 
to partial information in Section 4.6 is in the spirit of algorithms of Awerbuch and 
Mansour (2003) and Awerbuch and Kleinberg (2004). Extensions of the algorithm of 
Kalai and Vempala (2003) to the partial information setting appear in Awerbuch and 
Kleinberg (2004), Dani and Hayes (2006) and McMahan and Blum (2004). The results 
in Section 4.7 on approaching Nash equilibria in routing games are from Blum et al. 
(2006). 

Exercises 


4.1 Show that swap regret is at most N times larger than internal regret. 

4.2 Show an example (even with N = 3) where the ratio between the external and swap 
regret is unbounded. 

4.3 Show that the RWM algorithm with update rule $w^t_i = w^{t-1}_i (1 - \eta)^{\ell^{t-1}_i}$ achieves the same 
external regret bound as given in Theorem 4.6 for the PW algorithm, for losses in 
[0, 1]. 

4.4 Consider a setting where the payoffs are in the range [-1, +1], and the goal of the 
algorithm is to maximize its payoff. Derive a modified PW algorithm whose external 
regret is $O(\sqrt{Q^T_{max} \log N} + \log N)$, where $Q^T_{max} \ge Q^T_k$ for $k \in X_i$. 

4.5 Show an $\Omega(\sqrt{T \log N})$ lower bound on external regret, for the case that $T \ge N$. 

4.6 Improve the swap regret bound to $O(\sqrt{NT \log N})$. Hint: Use the observation that 
the sum of the losses of all the $A_i$ is bounded by T. 

4.7 (Open Problem) Does there exist an $\Omega(\sqrt{TN \log N})$ lower bound for swap regret? 

4.8 Show that if a player plays algorithm RWM (or PW) then it gives $\epsilon$-dominated actions 
small weight. Also, show that there are cases in which the external regret of a player 
can be small, yet it gives $\epsilon$-dominated actions high weight. 


CHAPTER 5 


Combinatorial Algorithms 
for Market Equilibria 


Vijay V. Vazirani 


Abstract 


Combinatorial polynomial time algorithms are presented for finding equilibrium prices and allocations 
for the linear utilities case of the Fisher and Arrow–Debreu models using the primal-dual schema and 
an auction-based approach, respectively. An interesting feature of the first algorithm is that it finds an 
optimal solution to a nonlinear convex program, the Eisenberg-Gale program. 

Resource allocation markets in Kelly’s model are also discussed and a strongly polynomial 
combinatorial algorithm is presented for one of them. 


5.1 Introduction 


Thinkers and philosophers have pondered over the notions of markets and money 
through the ages. The credit for initiating formal mathematical modeling and study 
of these notions is generally attributed to nineteenth-century economist Leon Walras 
(1874). The fact that Western economies are capitalistic had a lot to do with the over- 
whelming importance given to this study within mathematical economics — essentially, 
our most critical decision-making is relegated to pricing mechanisms. They largely de- 
termine the relative prices of goods and services, ensure that the economy is efficient, 
in that goods and services are made available to entities that produce items that are 
most in demand, and ensure a stable operation of the economy. 

A central tenet in pricing mechanisms is that prices be such that demand equals 
supply; that is, the economy should operate at equilibrium. It is not surprising therefore 
that perhaps the most celebrated theorem within general equilibrium theory, the Arrow— 
Debreu Theorem, establishes precisely the existence of such prices under a very general 
model of the economy. The First Welfare Theorem, which shows Pareto optimality of 
allocations obtained at equilibrium prices, provides important social justification for 
this theory. 

Although general equilibrium theory enjoyed the status of crown jewel within mathematical 
economics, it suffered from a serious shortcoming: other than a few isolated 
results, some of which were real gems, e.g., Eisenberg and Gale (1959) and Scarf 
(1973), it was essentially a nonalgorithmic theory. With the emergence of new markets 
on the Internet, which already form an important part of today’s economy and are pro- 
jected to grow considerably in the future, and the availability of massive computational 
power for running these markets in a distributed or centralized manner, the need for 
developing an algorithmic theory of markets and market equilibria is apparent. Such 
algorithms can also provide a valuable tool for understanding the repercussions of 
technological advances, new goods or changes to the tax structure on existing prices, 
production, and consumption. 

A good beginning has been made over the last 5 years within algorithmic game 
theory, starting with the work of Deng et al. (2002). However, considering the fact that 
markets were an active area of study for over a century within mathematical economics, 
it is safe to say that we have only scratched the surface of what should be a rich theory. 

Irving Fisher (see Brainard and Scarf, 2000) and Walras (1874) gave two fundamen- 
tal market models that were studied extensively within mathematical economics. The 
latter model is also called the exchange model or the Arrow—Debreu model (Arrow and 
Debreu, 1954). In this chapter we will present combinatorial algorithms for both these 
models for the case of linear utility functions. A second approach that has emerged for 
computing equilibria for these models is the efficient solution of convex programs, since 
equilibrium allocations for both these models can be captured via convex programs; see 
Chapter 6 for this approach. 

Two techniques have been primarily used for obtaining combinatorial algorithms 
for these models: the primal-dual schema (Devanur et al., 2002) and an auction-based 
approach (Garg and Kapoor, 2004). We will present algorithms for the Fisher and 
Arrow—Debreu models, using the first and second techniques, respectively. 

An interesting aspect of the first algorithm was the extension of the primal-dual 
schema from its usual setting of combinatorially solving, either exactly or ap- 
proximately, linear programs, to exactly solving a nonlinear convex program (see 
Section 5.5). The latter program, due to Eisenberg and Gale (1959), captures 
equilibrium allocations for the linear case of Fisher’s model. Unlike complementary 
slackness conditions for linear programs, which involve either primal or dual variables, 
but not both, KKT conditions for a nonlinear convex program simultaneously involve 
both types of variables. The repercussions of this are apparent in the way the algorithm 
is structured. 

In a different context, that of modeling and understanding TCP congestion control,¹ 
Kelly (1997) defined a class of resource allocation markets and gave a convex pro- 
gram that captures equilibrium allocations for his model. Interestingly enough, Kelly’s 
program has the same structure as the Eisenberg—Gale program (see also Chapter 22). 


¹ In particular, Kelly’s object was to explain the unprecedented success of TCP, and its congestion avoidance 
protocol due to Jacobson (1988), which played a crucial role in the phenomenal growth of the Internet and the 
deployment of a myriad of diverse applications on it. Fairness is a key property desired of a congestion avoidance 
protocol and Jacobson’s protocol does seem to ensure fairness. Recent results show that if Jacobson’s protocol 
is run on the end-nodes and the Floyd—Jacobson protocol (Floyd and Jacobson, 1993) is run at buffer queues, 
in the limit, traffic flows converge to an optimal solution of Kelly’s convex program, i.e., they are equilibrium 
allocations, see Low and Lapsley (1999). Furthermore, Kelly used his convex programming formulation to 
prove that equilibrium allocations in his model satisfy proportional fairness (see Section 5.13), thereby giving 
a formal ratification of Jacobson’s protocol. 
The flow market is of special significance within this framework. It consists of a 
network, with link capacities specified, and source-sink pairs of nodes, each with an 
initial endowment of money; allocations in this market are flows from each source to 
the corresponding sink. The problem is to find equilibrium flows and prices of edges 
(in the context of TCP, the latter can be viewed as drop rates at links). 

Kelly’s model attracted much theoretical study, partly with a view to designing 
next-generation protocols. Continuous time algorithms (though not having polynomial 
running time), for finding equilibrium flows in the flow market, were given by Kelly 
et al. (1998) (see also Wang et al., 2005, for more recent work along these lines). Soon 
after the appearance of Devanur et al. (2002), Kelly and Vazirani (2002) observed that 
Kelly’s model essentially generalizes Fisher’s linear case and stated, “Continuous time 
algorithms similar to TCP are known, but insights from discrete algorithms may be 
provocative.” 

With a view to answering this question, a systematic study of markets whose equilib- 
ria are captured by Eisenberg-Gale-type programs was undertaken by Jain and Vazirani 
(2006). In Section 5.14 we present, from this paper, a strongly polynomial algorithm 
for the special case of the flow market when there is one source and multiple sinks. 


5.2 Fisher’s Linear Case and the Eisenberg—Gale 
Convex Program 


Fisher’s linear case² is the following. Consider a market consisting of a set $B$ of buyers 
and a set $A$ of divisible goods. Assume $|A| = n$ and $|B| = n'$. We are given for each 
buyer $i$ the amount $e_i$ of money she possesses and for each good $j$ the amount $b_j$ of 
this good. In addition, we are given the utility functions of the buyers. Our critical 
assumption is that these functions are linear. Let $u_{ij}$ denote the utility derived by $i$ on 
obtaining a unit amount of good $j$. Thus if the buyer $i$ is given $x_{ij}$ units of good $j$, for 
$1 \le j \le n$, then the happiness she derives is 

\[ \sum_{j=1}^{n} u_{ij} x_{ij}. \]


Prices $p_1, \ldots, p_n$ of the goods are said to be market clearing prices if, after each buyer 
is assigned an optimal basket of goods relative to these prices, there is no surplus or 
deficiency of any of the goods. Our problem is to compute such prices in polynomial 
time. 

First observe that w.l.o.g. we may assume that each $b_j$ is unit, by scaling the $u_{ij}$’s 
appropriately. The $u_{ij}$’s and $e_i$’s are in general rational; by scaling appropriately, they 
may be assumed to be integral. We will make the mild assumption that each good has 
a potential buyer; i.e., a buyer who derives nonzero utility from this good. Under this 
assumption, market clearing prices do exist. 

It turns out that equilibrium allocations for Fisher’s linear case are captured as op- 
timal solutions to a remarkable convex program, the Eisenberg—Gale convex program. 


² See Section 5.13 for a special case of this market and a simple polynomial time algorithm for it. 

Before stating the program, it will be instructive to list considerations that would be 
useful in deriving such a program. 

Clearly, a convex program whose optimal solution is an equilibrium allocation must 
have as constraints the packing constraints on the $x_{ij}$’s. Furthermore, its objective 
function, which attempts to maximize utilities derived, should satisfy the following: 


• If the utilities of any buyer are scaled by a constant, the optimal allocation remains 
unchanged. 

• If the money of a buyer $b$ is split among two new buyers whose utility functions are the 
same as that of $b$, then the sum of the optimal allocations of the new buyers should be an 
optimal allocation for $b$. 


The money-weighted geometric mean of buyers’ utilities satisfies both these 
conditions: 

\[ \max \left( \prod_{i \in B} u_i^{e_i} \right)^{1 / \sum_i e_i}. \]

Clearly, the following objective function is equivalent: 

\[ \max \prod_{i \in B} u_i^{e_i}. \]

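Its log is used in the Eisenberg-Gale convex program: 

\[
\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{n'} e_i \log u_i \\
\text{subject to} \quad & u_i = \sum_{j=1}^{n} u_{ij} x_{ij} && \forall i \in B \\
& \sum_{i=1}^{n'} x_{ij} \le 1 && \forall j \in A \\
& x_{ij} \ge 0 && \forall i \in B,\ \forall j \in A
\end{aligned}
\tag{5.1}
\]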


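where $x_{ij}$ is the amount of good $j$ allocated to buyer $i$. Interpret Lagrangian variables, 
say $p_j$’s, corresponding to the second set of conditions as prices of goods. By the 
Karush-Kuhn-Tucker (KKT) conditions, optimal solutions to $x_{ij}$’s and $p_j$’s must 
satisfy the following: 

(i) $\forall j \in A$: $p_j \ge 0$. 
(ii) $\forall j \in A$: $p_j > 0 \Rightarrow \sum_{i \in B} x_{ij} = 1$. 
(iii) $\forall i \in B, \forall j \in A$: $\dfrac{u_{ij}}{p_j} \le \dfrac{\sum_{j \in A} u_{ij} x_{ij}}{e_i}$. 
(iv) $\forall i \in B, \forall j \in A$: $x_{ij} > 0 \Rightarrow \dfrac{u_{ij}}{p_j} = \dfrac{\sum_{j \in A} u_{ij} x_{ij}}{e_i}$. 
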
From these conditions, one can derive that an optimal solution to convex program (5.1) 
must satisfy the market clearing conditions. 
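
As a minimal sketch of how the four KKT conditions can be tested for a candidate solution, the following Python function checks them with exact arithmetic. The function name `kkt_violations` and the list-of-lists input layout are illustrative assumptions and do not come from the text. The two ratios in conditions (iii) and (iv) are compared by cross-multiplying, which assumes positive budgets.

```python
def kkt_violations(u, e, x, p):
    """Return the violated KKT conditions (i)-(iv) of program (5.1).

    u[i][j]: utility of buyer i per unit of good j (hypothetical layout)
    e[i]:    money of buyer i, assumed positive
    x[i][j]: amount of good j allocated to buyer i
    p[j]:    candidate price of good j
    """
    buyers, goods = range(len(u)), range(len(p))
    util = [sum(u[i][j] * x[i][j] for j in goods) for i in buyers]
    bad = []
    for j in goods:
        if p[j] < 0:
            bad.append(("i", j))
        if p[j] > 0 and sum(x[i][j] for i in buyers) != 1:
            bad.append(("ii", j))
    for i in buyers:
        for j in goods:
            # u_ij / p_j versus u_i / e_i, compared after cross-multiplication
            lhs, rhs = u[i][j] * e[i], util[i] * p[j]
            if lhs > rhs:
                bad.append(("iii", i, j))
            if x[i][j] > 0 and lhs != rhs:
                bad.append(("iv", i, j))
    return bad


u = [[2, 1], [1, 2]]
e = [1, 1]
x = [[1, 0], [0, 1]]
print(kkt_violations(u, e, x, [1, 1]))  # prints []
```

In this toy instance each buyer takes the good she values more, and unit prices satisfy all four conditions; raising the first price to 2 breaks condition (iv) for the first buyer.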
The Eisenberg and Gale program also helps prove, in a very simple manner, the 
following basic properties of equilibria for the linear case of Fisher’s model. 
Theorem 5.1. For the linear case of Fisher’s model: 

• If each good has a potential buyer, equilibrium exists. 

• The set of equilibrium allocations is convex. 

• Equilibrium utilities and prices are unique. 

• If all $u_{ij}$’s and $e_i$’s are rational, then equilibrium allocations and prices are also 
rational. Moreover, they can be written using polynomially many bits in the length 
of the instance. 


PROOF Corresponding to good $j$ there is a buyer $i$ such that $u_{ij} > 0$. By the 
third KKT condition, 

\[ p_j \ge \frac{e_i u_{ij}}{\sum_j u_{ij} x_{ij}} > 0. \]

Now, by the second KKT condition, $\sum_{i \in B} x_{ij} = 1$. Hence, prices of all goods are 
positive and all goods are fully sold. 

The third and fourth conditions imply that if buyer $i$ gets good $j$ then $j$ must 
be among the goods that give buyer $i$ maximum utility per unit money spent at 
current prices. Hence each buyer gets only a bundle consisting of her most desired 
goods, i.e., an optimal bundle. 

The fourth condition is equivalent to 

\[ \forall i \in B, \forall j \in A: \quad \frac{e_i u_{ij} x_{ij}}{\sum_{j \in A} u_{ij} x_{ij}} = p_j x_{ij}. \]

Summing over all $j$ gives 

\[ \forall i \in B: \quad \frac{e_i \sum_j u_{ij} x_{ij}}{\sum_{j \in A} u_{ij} x_{ij}} = \sum_j p_j x_{ij}. \]

This implies 

\[ \forall i \in B: \quad e_i = \sum_j p_j x_{ij}. \]

Hence the money of each buyer is fully spent. This completes the proof that 
market equilibrium exists. 

Since each equilibrium allocation is an optimal solution to the Eisenberg-Gale 
convex program, the set of equilibrium allocations must form a convex set. 

Since log is a strictly concave function, if there is more than one equilibrium, 
the utility derived by each buyer must be the same in all equilibria. This fact, 
together with the fourth condition, gives that the equilibrium prices are unique. 

Finally, we prove the fourth claim by showing that equilibrium allocations 
and prices are solutions to a system of linear equations. Let $q_j = 1/p_j$ be a new 
variable corresponding to each good $j$ and let $k$ be the number of nonzero $x_{ij}$’s in 
an equilibrium allocation. The system will consist of $k + n$ equations over $k + n$ 
unknowns, the latter being the $n$ $q_j$’s and the $k$ nonzero $x_{ij}$’s. The equations are: 
corresponding to each good $j$, the equality given by the second KKT condition, 
and corresponding to each nonzero $x_{ij}$, the equality given by the fourth KKT 
condition. 


5.3 Checking If Given Prices Are Equilibrium Prices 


Let $p = (p_1, \ldots, p_n)$ denote a vector of prices. Let us first devise an algorithm for 
answering the following question: Is $p$ the equilibrium price vector, and if so, what are 
equilibrium allocations for the buyers? 

At prices $p$, buyer $i$ derives $u_{ij}/p_j$ amount of utility per unit money spent on good $j$. 
Clearly, she will be happiest with goods that maximize this ratio. Define her bang per 
buck to be $\alpha_i = \max_j \{u_{ij}/p_j\}$. For each $i \in B$, $j \in A$, $\alpha_i \ge u_{ij}/p_j$, with equality 
holding only if $j$ is $i$’s bang per buck good. If there are several goods maximizing 
this ratio, she is equally happy with any combination of these goods. This motivates 
defining the following bipartite graph, $G$. Its bipartition is $(A, B)$ and for $i \in B$, $j \in A$, 
$(i, j)$ is an edge in $G$ iff $\alpha_i = u_{ij}/p_j$. We will call this graph the equality subgraph and 
its edges the equality edges. 
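
The definitions of bang per buck and the equality subgraph can be sketched in a few lines of Python. The function name `equality_subgraph` and the input layout are hypothetical choices made for this illustration; exact rational arithmetic is used so that the equality test is reliable.

```python
from fractions import Fraction


def equality_subgraph(u, p):
    """Return (alpha, edges) for utilities u[i][j] and positive prices p[j].

    alpha[i] is buyer i's bang per buck; edges is the set of pairs (i, j)
    with alpha[i] == u[i][j] / p[j], i.e., the equality edges.
    """
    ratio = [[Fraction(u_ij) / Fraction(p_j) for u_ij, p_j in zip(row, p)]
             for row in u]
    alpha = [max(row) for row in ratio]
    edges = {(i, j)
             for i, row in enumerate(ratio)
             for j, r in enumerate(row)
             if r == alpha[i]}
    return alpha, edges


alpha, edges = equality_subgraph([[1, 1], [1, 2]], [1, 1])
print(alpha == [1, 2], sorted(edges))  # prints True [(0, 0), (0, 1), (1, 1)]
```

Here the first buyer is indifferent between the two goods, so she has two equality edges, while the second buyer has only one.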


5.3.1 The Network N(p) 


Any goods sold along the edges of the equality subgraph will make buyers happiest, 
relative to prices $p$. Computing the largest amount of goods that can be sold in this 
manner, without exceeding the budgets of buyers or the amount of goods available 
(assumed unit for each good), can be accomplished by computing max-flow in the 
following network (see Figure 5.1). Direct edges of $G$ from $A$ to $B$ and assign a 
capacity of infinity to all these edges. Introduce source vertex $s$ and a directed edge 
from $s$ to each vertex $j \in A$ with a capacity of $p_j$. Introduce sink vertex $t$ and a directed 
edge from each vertex $i \in B$ to $t$ with a capacity of $e_i$. The network is clearly a function 
of the prices $p$ and will be denoted by $N(p)$. 


Figure 5.1. The network N(p). (A: goods; B: buyers; the edges from A to B have infinite capacity.) 




Corresponding to a feasible flow $f$ in network $N(p)$, let us define the allocation of 
goods to the buyers to be the following. If edge $(j, i)$ from good $j$ to buyer $i$ carries 
flow $f(j, i)$, then buyer $i$ receives $f(j, i)/p_j$ units of good $j$. 

The question posed above can be answered via one max-flow computation, as 
asserted in the following lemma. Its proof is straightforward and is omitted. 


Lemma 5.2 Prices $p$ are equilibrium prices iff in the network $N(p)$ the two cuts 
$(s, A \cup B \cup t)$ and $(s \cup A \cup B, t)$ are min-cuts. If so, allocations corresponding 
to any max-flow in $N(p)$ are equilibrium allocations. 
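
The test in Lemma 5.2 can be sketched as follows: build $N(p)$, run one max-flow computation, and compare its value with the total price and the total money. The code below is only an illustration under stated assumptions. The names `max_flow` and `check_prices`, the vertex labels, and the use of a plain Edmonds-Karp routine are choices made here, and a large finite capacity stands in for the infinite capacities, which does not change the max-flow value because it exceeds the total price.

```python
from collections import deque
from fractions import Fraction


def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts capacity map; returns (value, flow)."""
    res = {a: dict(nbrs) for a, nbrs in cap.items()}  # residual capacities
    for a, nbrs in cap.items():
        for b in nbrs:
            res.setdefault(b, {}).setdefault(a, 0)  # reverse residual edges
    value = 0
    while True:
        parent = {s: None}  # breadth-first search for an augmenting path
        queue = deque([s])
        while queue and t not in parent:
            a = queue.popleft()
            for b, c in res[a].items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if t not in parent:
            break
        path, b = [], t
        while parent[b] is not None:
            path.append((parent[b], b))
            b = parent[b]
        push = min(res[a][b] for a, b in path)
        for a, b in path:
            res[a][b] -= push
            res[b][a] += push
        value += push
    flow = {(a, b): c - res[a][b]
            for a, nbrs in cap.items() for b, c in nbrs.items()}
    return value, flow


def check_prices(u, e, p):
    """Return (is_equilibrium, allocation) for positive prices p."""
    goods, buyers = range(len(p)), range(len(e))
    alpha = [max(Fraction(u[i][j]) / p[j] for j in goods) for i in buyers]
    big = sum(p) + sum(e) + 1  # finite stand-in for infinite capacity
    cap = {"s": {("g", j): p[j] for j in goods}}
    for j in goods:
        cap[("g", j)] = {("b", i): big for i in buyers
                         if Fraction(u[i][j]) / p[j] == alpha[i]}
    for i in buyers:
        cap[("b", i)] = {"t": e[i]}
    value, flow = max_flow(cap, "s", "t")
    ok = value == sum(p) == sum(e)  # both cuts of Lemma 5.2 are saturated
    x = {(i, j): Fraction(flow[("g", j), ("b", i)]) / p[j]
         for j in goods for (_, i) in cap[("g", j)]}
    return ok, x
```

For example, with utilities `[[2, 1], [1, 2]]` and one unit of money per buyer, `check_prices` accepts the prices `[1, 1]` and rejects `[2, 1]`, since at the latter prices the goods cost more than the buyers can pay.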


5.4 Two Crucial Ingredients of the Algorithm 


The algorithm starts with very low prices that are guaranteed to be below the equilibrium 
prices for each good. The algorithm always works on the network N(·) w.r.t. the current 
prices p. W.r.t. the starting prices, buyers have surplus money left. The algorithm raises 
prices iteratively and reduces the surplus. When the surplus vanishes, it terminates; 
these prices are equilibrium prices. 

This algorithmic outline immediately raises two questions: 


• How do we ensure that the equilibrium price of no good is exceeded? 
• How do we ensure that the surplus money of buyers reduces fast enough that the 
algorithm terminates in polynomial time? 


The answers to these two questions lead to two crucial ingredients of the algorithm: 
tight sets and balanced flows. 


5.5 The Primal-Dual Schema in the Enhanced Setting 


We will use the notation setup in the previous sections to describe at a high level the 
new difficulties presented by the enhanced setting of convex programs and the manner 
in which the primal-dual schema is modified to obtain a combinatorial algorithm for 
solving the Eisenberg—Gale convex program. 

The fundamental difference between complementary slackness conditions for linear 
programs and KKT conditions for nonlinear convex programs is that whereas the 
former do not involve both primal and dual variables simultaneously in an equality 
constraint (obtained by assuming that one of the variables takes a nonzero value), the 
latter do. 

As described in the previous section, the algorithm will start with very low prices and 
keep increasing them greedily, i.e., the dual growth process is greedy. Indeed, all known 
primal-dual algorithms use a greedy dual growth process — with one exception, namely 
Edmonds’ algorithm for maximum weight matching in general graphs (Edmonds, 
1965). 

Now, the disadvantage of a greedy dual growth process is obvious — the fact that a 
raised dual is “bad,” in the sense that it “obstructs” other duals that could have led to a 
larger overall dual solution, may become clear only later in the run of the algorithm. In 
view of this, the issue of using more sophisticated dual growth processes has received 
a lot of attention, especially in the context of approximation algorithms. The problem 
with such a process is that it will make primal objects go tight and loose and the 
number of such reversals will have to be upper bounded in the running time analysis. 
The impeccable combinatorial structure of matching supports such an accounting and 
in fact this leads to a strongly polynomial algorithm. However, thus far, all attempts at 
making such a scheme work out for other problems have failed. 

In our case, even though the dual growth process is greedy, because of the more 
complex nature of KKT conditions, edges in the equality subgraph appear and disappear 
as the algorithm proceeds. Hence, we are forced to carry out the difficult accounting 
process alluded to above for bounding the running time. 

We next point out which KKT conditions the algorithm enforces and which ones 
it relaxes, as well as the exact mechanism by which it satisfies the latter. Throughout 
the algorithm, we enforce the first two conditions listed in Section 5.2. As mentioned 
in Section 5.4, at any point in the algorithm, via a max-flow in the network N(p), all 
goods can be sold; however, buyers may have surplus money left over. W.r.t. a balanced 
flow in network $N(p)$ (see Section 5.7 for a definition of such a flow), let $m_i$ be the 
money spent by buyer $i$. Thus, buyer $i$’s surplus money is $\gamma_i = e_i - m_i$. We will relax 
the third and fourth KKT conditions to the following: 

• $\forall i \in B, \forall j \in A$: $\dfrac{u_{ij}}{p_j} \le \dfrac{\sum_{j \in A} u_{ij} x_{ij}}{m_i}$. 
• $\forall i \in B, \forall j \in A$: $x_{ij} > 0 \Rightarrow \dfrac{u_{ij}}{p_j} = \dfrac{\sum_{j \in A} u_{ij} x_{ij}}{m_i}$. 

Consider the following potential function: 

\[ \Phi = \gamma_1^2 + \gamma_2^2 + \cdots + \gamma_{n'}^2. \]

We will give a process by which this potential function decreases by an inverse 
polynomial fraction in polynomial time (in each phase, as detailed in Lemma 5.21). When 
$\Phi$ drops all the way to zero, all KKT conditions are exactly satisfied. 

Finally, there is a marked difference between the way this algorithm will satisfy 
KKT conditions and the way primal-dual algorithms for LP’s do. The latter satisfy 
complementary conditions in discrete steps, i.e., in each iteration, the algorithm sat- 
isfies at least one new condition. So, if each iteration can be implemented in strongly 
polynomial time, the entire algorithm has a similar running time. On the other hand, 
the algorithm for Fisher’s linear case satisfies KKT conditions continuously — as the 
algorithm proceeds, the KKT conditions corresponding to each buyer get satisfied to a 
greater extent. 

Observe that at the start of the algorithm, the value of $\Phi$ is a function not just of 
the number of buyers and goods but of the length of the input (since it depends on 
the money possessed by buyers). Therefore, even though a phase of the algorithm can 
be implemented in strongly polynomial time, the running time of the entire algorithm 
is polynomial and not strongly polynomial. Indeed, obtaining a strongly polynomial 
algorithm for this problem remains a tantalizing open problem (see Section 5.15). 
5.6 Tight Sets and the Invariant 


Let $p$ denote the current prices within the run of the algorithm. For a set $S \subseteq A$ of goods, 
let $p(S)$ denote the total value of goods in $S$; this is simply the sum of current prices of 
goods in $S$. For a set $T \subseteq B$ of buyers, let $m(T)$ denote the total money possessed by 
the buyers in $T$; i.e., $m(T) = \sum_{i \in T} e_i$. For $S \subseteq A$, define its neighborhood in $N(p)$, 

\[ \Gamma(S) = \{ j \in B \mid \exists i \in S \text{ with } (i, j) \in N(p) \}. \]

Clearly, $\Gamma(S)$ is the set of buyers who are interested in goods in $S$ at current prices. 

We will say that S is a tight set if the current value of S exactly equals the money 
possessed by buyers who are interested in goods in $S$; i.e., $p(S) = m(\Gamma(S))$. Under this 
circumstance, increasing prices of goods in S may lead to exceeding the equilibrium 
price of some good. Therefore, when a set of goods goes tight, the algorithm freezes 
the prices of all goods in S. As described in Section 5.7, when new edges enter the 
equality subgraph, the algorithm may unfreeze certain frozen goods and again start 
increasing their prices. 

A systematic way of ensuring that the equilibrium price of no good is exceeded is 
to ensure the following Invariant. 


Invariant: The prices $p$ are such that the cut $(s, A \cup B \cup t)$ is a min-cut in $N(p)$. 


Lemma 5.3 For given prices $p$, network $N(p)$ satisfies the Invariant iff 

\[ \forall S \subseteq A: \quad p(S) \le m(\Gamma(S)). \]


PROOF The forward direction is trivial, since under max-flow (of value $p(A)$) 
every set $S \subseteq A$ must be sending $p(S)$ amount of flow to its neighborhood. 

Let us prove the reverse direction. Assume that $(s \cup A_1 \cup B_1, A_2 \cup B_2 \cup t)$ is a 
min-cut in $N(p)$, with $A_1, A_2 \subseteq A$ and $B_1, B_2 \subseteq B$ (see Figure 5.2). The capacity 
of this cut is $p(A_2) + m(B_1)$. Now, $\Gamma(A_1) \subseteq B_1$, since otherwise the cut will have 
infinite capacity. Moving $A_1$ and $\Gamma(A_1)$ to the $t$ side also results in a cut. By 
the condition stated in the Lemma, $p(A_1) \le m(\Gamma(A_1))$. Therefore, the capacity 
of this cut is no larger than the previous one and this is also a min-cut in $N(p)$. 
Hence the Invariant holds. 
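
For very small instances, the characterization in Lemma 5.3 can be checked directly by enumerating all subsets of goods. The sketch below does exactly that and therefore takes exponential time; it is meant only to make the condition concrete. The function name `violated_sets` is hypothetical, and equality edges are assumed to be given as (buyer, good) pairs, following the convention of Section 5.3.

```python
from itertools import combinations


def violated_sets(p, e, edges):
    """Return every set S of goods with p(S) > m(Gamma(S)).

    p[j]: price of good j; e[i]: money of buyer i;
    edges: set of (buyer, good) pairs in the equality subgraph.
    The Invariant holds iff the returned list is empty (Lemma 5.3).
    """
    goods = range(len(p))
    bad = []
    for size in range(1, len(p) + 1):
        for S in combinations(goods, size):
            gamma = {i for (i, j) in edges if j in S}  # buyers interested in S
            if sum(p[j] for j in S) > sum(e[i] for i in gamma):
                bad.append(S)
    return bad


print(violated_sets([1, 1], [1, 1], {(0, 0), (1, 1)}))  # prints []
```

With prices `[2, 1]`, money `[1, 1]` and edges `{(0, 0), (0, 1), (1, 1)}`, the sets `(0,)` and `(0, 1)` are reported, so the Invariant fails there.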


The Invariant ensures that, at current prices, all goods can be sold. The only even- 
tuality is that buyers may be left with surplus money. The algorithm raises prices 
systematically, thereby decreasing buyers’ surplus money. When $(s \cup A \cup B, t)$ is also 
a min-cut in N(p), by Lemma 5.2, equilibrium has been attained. 


5.7 Balanced Flows 


Denote the current network, $N(p)$, by simply $N$. We will assume that network $N$ 
satisfies the Invariant; i.e., $(s, A \cup B \cup t)$ is a min-cut in $N$. Given a feasible flow $f$ in 
$N$, let $R(f)$ denote the residual graph w.r.t. $f$. Define the surplus of buyer $i$, $\gamma_i(N, f)$, 
to be the residual capacity of the edge $(i, t)$ with respect to flow $f$ in network $N$, 
i.e., $e_i$ minus the flow sent through the edge $(i, t)$. The surplus vector is defined to be 
$\gamma(N, f) := (\gamma_1(N, f), \gamma_2(N, f), \ldots, \gamma_{n'}(N, f))$. Let $\|v\|$ denote the $l_2$ norm of vector 
$v$. A balanced flow in network $N$ is a flow that minimizes $\|\gamma(N, f)\|$. A balanced flow 
must be a max-flow in $N$ because augmenting a given flow can only lead to a decrease 
in the $l_2$ norm of the surplus vector. 

Figure 5.2. Min-cut in $N(p)$. There are no edges from $A_1$ to $B_2$. 


Lemma 5.4 All balanced flows in $N$ have the same surplus vector. 


PROOF It is easy to see that if $\gamma_1$ and $\gamma_2$ are the surplus vectors w.r.t. flows $f_1$ 
and $f_2$, then $(\gamma_1 + \gamma_2)/2$ is the surplus vector w.r.t. the flow $(f_1 + f_2)/2$. Since the 
set of feasible flows in $N$ is a convex region, so is the set of all feasible surplus 
vectors. Since a balanced flow minimizes a strictly convex function of the surplus 
vector (the square of its $l_2$ norm), the optimal surplus vector must be unique. 
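
As a small worked illustration (constructed here, not taken from the text), consider a network with a single good of price 2 and two buyers, each with money 2 and each having an equality edge to the good. Every max-flow sends $a$ units to the first buyer and $2 - a$ units to the second, for some $0 \le a \le 2$, so the surplus vectors of max-flows form a segment and exactly one point of it has minimum norm:

```latex
\gamma(a) = (2 - a,\; a), \qquad
\|\gamma(a)\|^2 = (2 - a)^2 + a^2 = 2(a - 1)^2 + 2 .
% The squared norm is minimized uniquely at a = 1, so the balanced surplus
% vector is (1, 1), whereas the max-flow with a = 2 has surplus vector (0, 2)
% and squared norm 4.
```

All balanced flows in this example therefore share the surplus vector $(1, 1)$, in line with Lemma 5.4.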


The following property of balanced flows will be used critically in the algorithm.³ 


Property 1: If $\gamma_j(N, f) < \gamma_i(N, f)$ then there is no path from node $j$ to node $i$ 
in $R(f) - \{s, t\}$. 


Theorem 5.5 A maximum-flow in N is balanced iff it satisfies Property 1. 


PROOF Let $f$ be a balanced flow and let $\gamma_i(N, f) > \gamma_j(N, f)$ for some $i, j \in B$. 
Suppose, for the sake of contradiction, there is a path from $j$ to $i$ in $R(f) - \{s, t\}$. 

In $N$, the only edge out of $j$ is the edge $(j, t)$. Since the path in $R(f)$ from $j$ to $i$ 
must start with a positive capacity edge which is different from edge $(j, t)$, by flow 
conservation, the capacity of $(t, j)$ must be positive in $R(f)$. Since $\gamma_i(N, f) > 0$, 
the edge $(i, t)$ has a positive capacity in $R(f)$. Now, the edges $(t, j)$ and $(i, t)$ 
concatenated with the path from $j$ to $i$ give us a cycle with positive residual 
capacity in $R(f)$ (see Figure 5.3). Sending a circulation of positive value along 
this cycle will result in another max-flow in which the residual capacity of $j$ is 
slightly larger and that of $i$ is slightly smaller; i.e., the flow is more balanced. This 
contradicts the fact that $f$ is a balanced flow. 

Figure 5.3. The circulation in $R(f)$ if Property 1 does not hold. 

³ Unlike the previous sections, in Section 5.7, $j$ will denote a buyer. 

To prove the other direction, first observe that the $l_2$ norm of the surplus vector 
of a max-flow $f$ satisfying Property 1 is locally optimum w.r.t. changes in pairs 
of components of the surplus vector. This is so because any circulation in $R(f)$ 
can only send flow from a high surplus buyer to a low surplus buyer resulting 
in a less balanced flow. Now, since the $l_2$ norm is a convex function, any 
locally optimal solution is also globally optimal. Hence, a max-flow $f$ satisfying 
Property 1 must be a balanced flow. 
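
Theorem 5.5 turns the balancedness of a given max-flow into a reachability question, which can be sketched as below. The function name `property1_violations` and its input format are assumptions made for this illustration. The residual graph $R(f) - \{s, t\}$ is built from the equality edges alone: an arc from a good to a buyer is always present (infinite capacity), and the reverse arc is present exactly when the edge carries positive flow.

```python
from collections import deque


def property1_violations(edge_flow, surplus):
    """Return pairs (j, i) of buyers violating Property 1.

    edge_flow maps (good, buyer) equality edges to the flow they carry;
    surplus[i] is the surplus of buyer i w.r.t. the same flow, which is
    assumed to be a max-flow. A pair (j, i) is reported when
    surplus[j] < surplus[i] and j reaches i in R(f) - {s, t}.
    """
    out = {}
    for (g, b), fl in edge_flow.items():
        out.setdefault(("g", g), set()).add(("b", b))      # forward arc
        if fl > 0:
            out.setdefault(("b", b), set()).add(("g", g))  # reverse arc
    bad = []
    for j in range(len(surplus)):
        seen, queue = {("b", j)}, deque([("b", j)])
        while queue:  # breadth-first search from buyer j
            v = queue.popleft()
            for w in out.get(v, ()):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        bad += [(j, i) for i in range(len(surplus))
                if ("b", i) in seen and surplus[j] < surplus[i]]
    return bad


# One good of price 2, two buyers with money 2 each.
print(property1_violations({(0, 0): 2, (0, 1): 0}, [0, 2]))  # prints [(0, 1)]
print(property1_violations({(0, 0): 1, (0, 1): 1}, [1, 1]))  # prints []
```

The first flow gives everything to one buyer and is not balanced; the second splits the good evenly and satisfies Property 1.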


5.7.1 Finding a Balanced Flow 


We will show that the following algorithm, which uses a divide and conquer strategy, 
finds a balanced flow in the given network N in polynomial time. As stated above, we 
will assume that this network satisfies the Invariant, i.e., $(s, A \cup B \cup t)$ is a min-cut 
in $N$. 

Continuously reduce the capacities of all edges that go from $B$ to $t$, other than those 
edges whose capacity becomes zero, until the capacity of the cut $(\{s\} \cup A \cup B, \{t\})$ 
becomes the same as the capacity of the cut $(\{s\}, A \cup B \cup \{t\})$. Let the resulting network 
be $N'$ and let $f'$ be a max-flow in $N'$. Find a maximal $s$-$t$ min-cut in $N'$, say $(S, T)$, 
with $s \in S$ and $t \in T$. 

Case 1: If $T = \{t\}$ then find a max-flow in $N'$ and output it; this will be a balanced 
flow in $N$. 

Case 2: Otherwise, let $N_1$ and $N_2$ be the subnetworks of $N$ induced by $S \cup \{t\}$ 
and $T \cup \{s\}$, respectively. (Observe that $N_1$ and $N_2$ inherit original capacities from 
$N$ and not the reduced capacities from $N'$.) Let $A_1$ and $B_1$ be the subsets of $A$ and 
$B$, respectively, induced by $N_1$. Similarly, let $A_2$ and $B_2$ be the subsets of $A$ and $B$, 
respectively, induced by $N_2$. Recursively find balanced flows, $f_1$ and $f_2$, in $N_1$ and $N_2$, 
respectively. Output the flow $f = f_1 \cup f_2$; this will be a balanced flow in $N$. 
Lemma 5.6 $f$ is a max-flow in $N$. 


PROOF In the first case, i.e., $T = \{t\}$, the algorithm outputs a max-flow in $N'$. 
This flow must saturate the cut $(\{s\} \cup A \cup B, \{t\})$. However, since the capacity 
of this cut in $N'$ is the same as the capacity of the cut $(\{s\}, A \cup B \cup \{t\})$, by the 
Invariant, this is also a max-flow in $N$. 

Next let us consider the second case. Since $N_1$ and $N_2$ are edge-disjoint networks, 
$f = f_1 \cup f_2$ will be a feasible flow in $N$. We will show that $f$ must saturate 
all edges from $s$ to $A$ and therefore by the Invariant, it is a max-flow. 

Let $g$ be a max-flow in $N$. Observe that $N'$, and hence $N$, cannot have any edges 
from $A_1$ to $B_2$. Therefore, all flow of $g$ going to $A_1$ must flow via $B_1$. Therefore, 
the restriction of $g$ to $N_1$ saturates all edges from $s$ to $A_1$ in $N_1$. Therefore, so 
must $f_1$ since it is a max-flow in $N_1$. 

Let $f'$ be a max-flow in $N'$. Since $(S, T)$ is a min-cut in $N'$, $f'$ must saturate 
all edges from $s$ to $A_2$. Furthermore, all flow of $f'$ going to $A_2$ must flow via $B_2$, 
i.e., the restriction of $f'$ to flow going through $A_2$ is a feasible flow in $N_2$. Since 
$f_2$ is a max-flow in $N_2$, it must also saturate all edges from $s$ to $A_2$. Hence $f$ 
saturates all edges from $s$ to $A$ in $N$, and is therefore a max-flow. 


Lemma 5.7 $f$ is a balanced flow in network $N$. 


PROOF We will show, by induction on the depth of recursion, that the max-flow 
output by the algorithm is a balanced flow in $N$. In the base case, the algorithm 
terminates in the first case; i.e., $T = \{t\}$, and the surplus vector is precisely the amounts 
subtracted from capacities of edges from $B$ to $t$ in going from $N$ to $N'$. Clearly, 
this surplus vector makes components as equal as possible, thus minimizing its $l_2$ 
norm. 

Next assume that the algorithm terminates in the second case. By Lemma 5.6, $f$ 
is a max-flow; we will show that it satisfies Property 1 and is therefore a balanced 
flow. By the induction hypothesis, $f_1$ and $f_2$ are balanced flows in $N_1$ and $N_2$, 
respectively, and therefore Property 1 cannot be violated in these two networks. 

Let $R$ be the residual graph of $N$ w.r.t. flow $f$; we only need to show that 
paths in $R$ that go from one part to the other do not violate Property 1. As already 
observed in the proof of Lemma 5.6, there are no edges from $A_1$ to $B_2$ in $N$, and 
therefore there are no residual paths from $j \in B_1$ to $i \in B_2$. There may however 
be paths going from $j \in B_2$ to $i \in B_1$ in $R$. We will show that for any two nodes 
$i \in B_1$ and $j \in B_2$, $\gamma_i(N, f) \le \gamma_j(N, f)$, thereby establishing Property 1. 

First observe that by the maximality of the min-cut found in $N'$, all nodes in $B_2$ 
have surplus capacity $> 0$ w.r.t. flow $f'$ in $N'$ (all nodes having surplus zero must 
be in $B_1$). Therefore, the same amount, say $X$, was subtracted from the capacity 
of each edge $(i, t)$, $i \in B_2$, in going from network $N$ to $N'$. We will show that 
$\gamma_i(N, f) \ge X$ for each $i \in B_2$. A similar proof shows that $\gamma_i(N, f) \le X$ for each 
$i \in B_1$, thereby establishing Property 1. 

Let $L$ be the set of vertices in $B_2$ having minimum surplus w.r.t. $f$. Let $K$ be 
the set of vertices in $A_2$ that are reachable via an edge from $L$ in $R$. We claim 
that $\Gamma(K) = L$, because otherwise, there will be a residual path from $i \in L$ to 
$j \in B_2 - L$, thereby violating Property 1. 

Let $c(K)$ denote the sum of capacities of all edges from $s$ to vertices of $K$. 
Observe that all these edges are saturated in $f'$ and this flow must leave via 
vertices of $L$. Let $E_L$ denote the set of edges going from $L$ to $t$. Let $c(L)$ and 
$c'(L)$ denote the sum of capacities of all edges in $E_L$ in networks $N$ and $N'$, 
respectively. By the argument given above, $c'(L) \ge c(K)$. 

Since $X$ is subtracted from all edges in $E_L$ in going from network $N$ to $N'$, 
$c(L) = c'(L) + |L| X$. The total surplus of the edges in $E_L$ w.r.t. flow $f$ is 

\[ c(L) - c(K) = c'(L) + |L| X - c(K) \ge |L| X. \]

Finally, since all edges in $E_L$ have the same surplus, each has surplus $\ge X$. The 
lemma follows. 


Theorem 5.8 The above-stated algorithm computes a balanced flow in network 
N using at most n max-flow computations. 


PROOF Clearly, the number of goods in the biggest piece drops by at least 1 in 
each iteration. Therefore, the depth of recursion is at most $n$. Next, observe that 
$N_1$ and $N_2$ are vertex disjoint, other than $s$ and $t$, and therefore, the time needed 
to compute max-flows in them is bounded by the time needed to compute a max-flow 
in $N$. Hence, the total computational overhead is $n$ max-flow computations. 
Finally, as shown in Lemma 5.7, the flow output by the algorithm is a balanced 
flow in $N$. 


5.8 The Main Algorithm 


First we show how to initialize prices so the Invariant holds. The following two 
conditions guarantee this. 


• The initial prices are low enough that each buyer can afford all the goods. Fixing 
prices at $1/n$ suffices, since the goods together cost one unit and all $e_i$’s are integral. 

• Each good $j$ has an interested buyer, i.e., has an edge incident at it in the equality 
subgraph. Compute $\alpha_i$ for each buyer $i$ at the prices fixed in the previous step and 
compute the equality subgraph. If good $j$ has no edge incident, reduce its price to 

\[ p_j = \max_i \left\{ \frac{u_{ij}}{\alpha_i} \right\}. \]
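
The two initialization steps above can be sketched as follows. The function name `initial_prices` is hypothetical, utilities are assumed to be integers, and every buyer is assumed to have positive utility for at least one good so that each $\alpha_i$ is positive.

```python
from fractions import Fraction


def initial_prices(u):
    """Initial prices for integer utilities u[i][j] (buyers i, goods j)."""
    buyers, goods = range(len(u)), range(len(u[0]))
    n = len(u[0])
    p = [Fraction(1, n) for _ in goods]  # step 1: every price is 1/n
    # Bang per buck of each buyer at the prices 1/n.
    alpha = [max(u[i][j] / p[j] for j in goods) for i in buyers]
    for j in goods:
        # Step 2: a good with no equality edge gets its price lowered until
        # some buyer becomes interested in it.
        if all(u[i][j] / p[j] != alpha[i] for i in buyers):
            p[j] = max(u[i][j] / alpha[i] for i in buyers)
    return p


print(initial_prices([[2, 1], [4, 1]]))
# prints [Fraction(1, 2), Fraction(1, 4)]
```

Lowering $p_j$ to $\max_i u_{ij}/\alpha_i$ leaves every $\alpha_i$ unchanged, because $u_{ij}/p_j \le \alpha_i$ still holds for all buyers, with equality for at least one of them.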

If the Invariant holds, it is easy to see that there is a unique maximal tight set $S \subseteq A$. 
Clearly, the prices of goods in the tight set cannot be increased without violating the 
Invariant. On the other hand, the algorithm can raise prices of all goods in A — S. 
However, we do not know any way of bounding the running time of any algorithm 
based on such an approach. In fact, it seems that any such algorithm can be forced 
to take a large number of steps in which it makes only very small progress toward 
decreasing the surplus of the buyers, thereby taking super polynomial time. 
Instead, we will show how to use the notion of balanced flow to give a polynomial 
time algorithm. The idea is to always raise prices of those goods which are desired by 
buyers having a lot of surplus money. Eventually, when a subset of these goods goes 
tight, the surplus of some of these buyers vanishes, thus leading to substantial progress. 
Property 1 of balanced flows provides us with a powerful condition to ensure that even 
as the network N(p) changes because of changes in p, the algorithm can still keep 
working with a set of buyers having a large surplus. 

The iterative improvement steps follow the spirit of the primal-dual schema: The 
“primal” variables are the flows in the edges of N(p) and the “dual” variables are 
the current prices. The current flow suggests how to improve the prices and vice 
versa. 

A run of the algorithm is partitioned into phases; each phase ends with a new set 
going tight. Each phase is partitioned into iterations that are defined below. 

A phase starts with computation of a balanced flow, say $f$, in the current network, 
$N(p)$. If the algorithm of Section 5.7 for finding a balanced flow terminates in 
Case 1, then by Lemma 5.2 the current prices and allocations are equilibrium prices 
and allocations and the algorithm halts. Otherwise, let $\delta$ be the maximum surplus of 
buyers w.r.t. $f$. Initialize $I$ to be the set of buyers having surplus $\delta$. Let $J$ be the set of 
goods that have edges to $I$ in $N(p)$. The network induced by $I \cup J$ is called the active 
subgraph. 

At this point, we are ready to raise prices of goods in $J$. However, we would like to 
do this in such a way that for each buyer $i \in I$, the set of goods she likes best, which 
are all in $J$, remains unchanged as prices increase. This can be accomplished by raising 
prices of goods in $J$ in such a way that the ratio of any two prices remains unchanged. 
The rest of the algorithm for a phase is as follows. 

Step ⋆: Multiply the current prices of all goods in $J$ by variable $x$, initialize $x$ to 1 
and raise $x$ continuously until one of the following two events happens. Observe that 
as soon as $x > 1$, buyers in $B - I$ are no longer interested in goods in $J$ and all such 
edges can be dropped from the equality subgraph and $N$. 


• Event 1: If a subset $S \subseteq J$ goes tight, the current phase terminates and the algorithm 
starts with the next phase. 

• Event 2: As prices of goods in $J$ keep increasing, goods in $A - J$ become more and 
more desirable for buyers in $I$. If as a result an edge $(i, j)$, with $i \in I$ and $j \in A - J$, 
enters the equality subgraph (see Figure 5.4), add directed edge $(j, i)$ to network $N(p)$ 
and compute a balanced flow, say $f$, in the current network, $N(p)$. If the balanced 
flow algorithm terminates in Case 1, halt and output the current prices and allocations. 
Otherwise, let $R$ be the residual graph corresponding to $f$. Determine the set of all 
buyers that have residual paths to buyers in the current set $I$ (clearly, this set will contain 
all buyers in $I$). Update the new set $I$ to be this set. Update $J$ to be the set of goods that 
have edges to $I$ in $N(p)$. Go to Step ⋆. 


To complete the algorithm, we simply need to compute the smallest values of $x$ at 
which Event 1 and Event 2 happen, and consider only the smaller of these. For Event 
2, this is straightforward. We give an algorithm for Event 1 in the next section. 
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Figure 5.4. If Event 2 happens, edge (/, i) is added to N(p). 


5.9 Finding Tight Sets 


Let p denote the current price vector (i.e., at x = 1). We first present a lemma that
describes how the min-cut changes in N(x · p) as x increases. Throughout this section,
we will use the function m to denote money w.r.t. prices p. W.l.o.g. assume that w.r.t.
prices p the tight set in G is empty (since we can always restrict attention to the active
subgraph, for the purposes of finding the next tight set). Define

\[
x^* = \min_{\emptyset \neq S \subseteq A} \frac{m(\Gamma(S))}{m(S)},
\]

the value of x at which a nonempty set goes tight. Let S* denote the tight set at
prices x* · p. If (s ∪ A_1 ∪ B_1, A_2 ∪ B_2 ∪ t) is a cut in the network, we will assume that
A_1, A_2 ⊆ A and B_1, B_2 ⊆ B.
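
On very small instances the definition of x* can be evaluated directly by enumerating all nonempty subsets of goods. The sketch below does exactly that and is meant only as a reference point for the definition, since it takes exponential time; the function name and the dictionaries `p`, `money` and `likes` (the neighbours Γ of each good in the equality subgraph) are hypothetical names chosen here.

```python
from fractions import Fraction
from itertools import combinations


def brute_force_x_star(p, money, likes):
    """Evaluate x* = min over nonempty S of m(Gamma(S)) / m(S) by enumeration.

    p[j]:     price of good j.
    money[i]: money of buyer i.
    likes[j]: set of buyers adjacent to good j in the equality subgraph.
    Returns (x_star, S) for one minimizing set S (ties go to the first found).
    """
    goods = sorted(p)
    best = None
    for k in range(1, len(goods) + 1):
        for S in combinations(goods, k):
            neighbours = set().union(*(likes[j] for j in S))
            ratio = Fraction(sum(money[i] for i in neighbours)) / sum(p[j] for j in S)
            if best is None or ratio < best[0]:
                best = (ratio, set(S))
    return best
```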


Lemma 5.9 W.r.t. prices x · p:
• if x ≤ x* then (s, A ∪ B ∪ t) is a min-cut.

• if x > x* then (s, A ∪ B ∪ t) is not a min-cut. Moreover, if (s ∪ A_1 ∪ B_1, A_2 ∪
B_2 ∪ t) is a min-cut in N(x · p) then S* ⊆ A_1.


PROOF Suppose x ≤ x*. By definition of x*,

\[
\forall S \subseteq A: \quad x \cdot m(S) \le m(\Gamma(S)).
\]

Therefore by Lemma 5.3, w.r.t. prices x · p, the Invariant holds. Hence (s, A ∪
B ∪ t) is a min-cut.

Next suppose that x > x*. Since x · m(S*) > x* · m(S*) = m(Γ(S*)), w.r.t.
prices x · p, the cut (s ∪ S* ∪ Γ(S*), t), with all remaining vertices on the t side,
has strictly smaller capacity than the cut (s, A ∪ B ∪ t). Therefore the latter cannot
be a min-cut.

Now consider the min-cut (s ∪ A_1 ∪ B_1, A_2 ∪ B_2 ∪ t). Let S* ∩ A_2 = S_2 and
S* − S_2 = S_1. Suppose S_2 ≠ ∅. Clearly Γ(S_1) ⊆ B_1 (otherwise the cut will have
infinite capacity). If m(Γ(S_2) ∩ B_2) < x · m(S_2), then by moving S_2 and Γ(S_2) to
the s side of this cut, we can get a smaller cut, contradicting the minimality of the
cut picked. In particular, m(Γ(S*) ∩ B_2) ≤ m(Γ(S*)) = x* · m(S*) < x · m(S*).
Therefore S_2 ≠ S*, and hence, S_1 ≠ ∅. Furthermore,

\[
m(\Gamma(S_2) \cap B_2) \ge x \cdot m(S_2) > x^* \cdot m(S_2).
\]

On the other hand,

\[
m(\Gamma(S_2) \cap B_2) + m(\Gamma(S_1)) \le x^* \, (m(S_2) + m(S_1)).
\]

The two imply that

\[
\frac{m(\Gamma(S_1))}{m(S_1)} < x^*,
\]

contradicting the definition of x*. Hence S_2 = ∅ and S* ⊆ A_1.




Lemma 5.10 Let x = m(B)/m(A) and suppose that x > x*. If (s ∪ A_1 ∪
B_1, A_2 ∪ B_2 ∪ t) is a min-cut in N(x · p) then A_1 must be a proper subset of
A.


PROOF If A_1 = A, then B_1 = B (otherwise this cut has ∞ capacity), and (s ∪
A ∪ B, t) is a min-cut. But for the chosen value of x, this cut has the same capacity
as (s, A ∪ B ∪ t). Since x > x*, the latter is not a min-cut by Lemma 5.9. Hence,
A_1 is a proper subset of A.


Lemma 5.11 x* and S* can be found using n max-flow computations.


PROOF Let x = m(B)/m(A). Clearly, x ≥ x*. If (s, A ∪ B ∪ t) is a min-cut in
N(x · p), then by Lemma 5.9, x* = x. If so, S* = A.

Otherwise, let (s ∪ A_1 ∪ B_1, A_2 ∪ B_2 ∪ t) be a min-cut in N(x · p). By Lemmas
5.9 and 5.10, S* ⊆ A_1 ⊂ A. Therefore, it is sufficient to recurse on the smaller
graph (A_1, Γ(A_1)).
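
The recursion in this proof can be sketched as a loop that shrinks the set of goods until the trivial cut (s, A ∪ B ∪ t) is a min-cut. The code below is an illustrative sketch rather than the chapter's implementation: it uses a plain Edmonds-Karp max-flow with exact rational capacities, stands in for the infinite-capacity edges with a value larger than any possible flow, and all function and parameter names are assumptions introduced here.

```python
from collections import deque
from fractions import Fraction


def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts capacity map; returns (value, s_side)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, Fraction(0))  # reverse edges
    value = Fraction(0)
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No augmenting path: the visited vertices form the s side of a min-cut.
            return value, set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[a][b] for a, b in path)
        for a, b in path:
            res[a][b] -= push
            res[b][a] += push
        value += push


def first_tight_set(p, money, likes):
    """Return (x_star, S_star) by repeated min-cuts, as in the proof of Lemma 5.11."""
    goods = set(p)
    while True:
        buyers = set().union(*(likes[j] for j in goods))
        total_money = sum(money[i] for i in buyers)
        x = Fraction(total_money) / sum(p[j] for j in goods)
        big = total_money + 1  # exceeds any flow value, so it acts as infinity
        cap = {"s": {("g", j): x * p[j] for j in goods}}
        for j in goods:
            cap[("g", j)] = {("b", i): big for i in likes[j]}
        for i in buyers:
            cap[("b", i)] = {"t": money[i]}
        value, s_side = max_flow(cap, "s", "t")
        if value == x * sum(p[j] for j in goods):
            return x, goods  # (s, A ∪ B ∪ t) is a min-cut, so x* = x and S* = A
        goods = {j for j in goods if ("g", j) in s_side}  # recurse on A_1
```

Each pass removes at least one good (Lemma 5.10), which is where the bound of n max-flow computations comes from.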


5.10 Running Time of the Algorithm

Let U = max_{i ∈ B, j ∈ A} {u_ij} and let Δ = nU^n.


Lemma 5.12 At the termination of a phase, the prices of goods in the newly
tight set must be rational numbers with denominator ≤ Δ.


PROOF Let S be the newly tight set and consider the equality subgraph induced
on the bipartition (S, Γ(S)). Assume w.l.o.g. that this graph is connected (otherwise
we prove the lemma for each connected component of this graph). Let j ∈ S.
Pick a subgraph in which j can reach all other vertices j' ∈ S. Clearly, at most
2|S| ≤ 2n edges suffice. If j reaches j' with a path of length 2l, then p_j' = a p_j/b
where a and b are products of l utility parameters (u_ik's) each. Since alternate
edges of this path contribute to a and b, we can partition the u_ik's in this subgraph
into two sets such that a and b use u_ik's from distinct sets. These considerations
lead easily to showing that m(S) = p_j c/d where c ≤ Δ. Now,

\[
p_j = m(\Gamma(S)) \, d / c,
\]

hence proving the lemma.


Lemma 5.13 Consider two phases P and P', not necessarily consecutive, such
that good j lies in the newly tight sets at the end of P as well as P'. Then the
increase in the price of j, going from P to P', is ≥ 1/Δ².


PROOF Let the prices of j at the end of P and P' be p/q and r/s, respectively.
Clearly, r/s > p/q. By Lemma 5.12, q ≤ Δ and s ≤ Δ. Therefore the increase
in price of j,

\[
\frac{r}{s} - \frac{p}{q} = \frac{rq - ps}{sq} \ge \frac{1}{\Delta^2}.
\]


Within a phase, we will call each occurrence of Events 1 and 2 an iteration.

Lemma 5.14 The total number of iterations in a phase is bounded by n.


PROOF After an iteration due to Event 2, at least one new good must move into
the active subgraph. Since there is at least one good in the active subgraph at the
start of a phase, the total number of iterations in a phase due to Event 2 is at
most n − 1. Finally, the last iteration in each phase is due to Event 1. The lemma
follows.


Lemma 5.15 If f and f* are respectively a feasible and a balanced flow
in N(p) such that γ_i(p, f*) = γ_i(p, f) − δ, for some i ∈ B and δ > 0, then
‖γ(p, f*)‖² ≤ ‖γ(p, f)‖² − δ².


PROOF Suppose we start with f and get a new flow f' by decreasing the surplus
of i by δ, and increasing the surpluses of some other buyers in the process. We
show that this already decreases the square of the l_2 norm of the surplus vector by
δ² and so the lemma follows.

Consider the flow f* − f. Decompose this flow into flow paths and circulations.
Among these, augment f with only those that go through the edge (i, t), to
get f'. These are either paths that go from s to i to t, or circulations that go from
i to t to some i_l and back to i. Then γ_i(f') = γ_i(f*) = γ_i(f) − δ and for a set
of vertices i_1, i_2, ..., i_k, we have γ_{i_l}(f') = γ_{i_l}(f) + δ_l, s.t. δ_1, δ_2, ..., δ_k > 0 and
δ_1 + ··· + δ_k ≤ δ. Moreover, for all l, there is a path from i to i_l in R(p, f*). Since f*
is balanced, and satisfies Property 1, γ_i(f') = γ_i(f*) ≥ γ_{i_l}(f*) ≥ γ_{i_l}(f').

By Lemma 5.16, ‖γ(p, f')‖² ≤ ‖γ(p, f)‖² − δ² and since f* is a balanced
flow in N(p), ‖γ(p, f*)‖² ≤ ‖γ(p, f')‖².


Lemma 5.16 If a ≥ b_i ≥ 0, i = 1, 2, ..., n and δ ≥ δ_1 + ··· + δ_n where δ, δ_j ≥
0, j = 1, 2, ..., n, then

\[
\|(a, b_1, b_2, \ldots, b_n)\|^2 \le \|(a + \delta, b_1 - \delta_1, b_2 - \delta_2, \ldots, b_n - \delta_n)\|^2 - \delta^2.
\]


PROOF

\[
(a + \delta)^2 + \sum_{i=1}^{n} (b_i - \delta_i)^2 - a^2 - \sum_{i=1}^{n} b_i^2
\;\ge\; \delta^2 + 2a \Bigl( \delta - \sum_{i=1}^{n} \delta_i \Bigr) \;\ge\; \delta^2.
\]

Let N_0 denote the network at the beginning of a phase. Assume that the phase
consists of k iterations, and that N_t denotes the network at the end of iteration t. Let f_t
be a balanced flow in N_t, 0 ≤ t ≤ k.


Lemma 5.17 f_t is a feasible flow in N_{t+1}, for 0 ≤ t < k.


PROOF The lemma follows from the fact that each of the two actions, raising
the prices of goods in J or adding an edge as required in Event 2, can only lead
to a network that supports an augmented max-flow.


Corollary 5.18 ‖γ(N_t)‖ is monotonically decreasing with t.


Let δ_t denote the minimum surplus of a buyer in the active subgraph in network N_t,
for 0 ≤ t ≤ k; clearly, δ_0 = δ.


Lemma 5.19 If δ_{t−1} − δ_t > 0 then there exists an i ∈ H such that γ_i(p_{t−1}) −
γ_i(p_t) ≥ δ_{t−1} − δ_t.


PROOF Consider the residual network R(p_t, f) corresponding to the balanced
flow computed at the end of iteration t. By definition of H_t, every vertex v ∈
H_t \ H_{t−1} can reach a vertex i ∈ H_{t−1} in R(p_t, f) and therefore, by Theorem 5.5,
γ_v(p_t) ≥ γ_i(p_t). This means that minimum surplus δ_t is achieved by a vertex i
in H_{t−1}. Hence, the surplus of vertex i is decreased by at least δ_{t−1} − δ_t during
iteration t.


Lemma 5.20 If δ_{t+1} < δ_t then ‖γ(N_t)‖² − ‖γ(N_{t+1})‖² ≥ (δ_t − δ_{t+1})², for 0 ≤
t < k.


PROOF By Lemma 5.19, if δ_{t+1} < δ_t then there is a buyer i whose surplus drops
by δ_t − δ_{t+1} in going from f_t to f_{t+1}. By Lemmas 5.15 and 5.17, we get the
desired conclusion.


Lemma 5.21 In a phase, the square of the l_2 norm of the surplus vector drops
by a factor of

\[
1 - \frac{1}{n^2}.
\]


PROOF We will first prove that

\[
\|\gamma(N_0)\|^2 - \|\gamma(N_k)\|^2 \ge \frac{\delta^2}{n}.
\]

Observe that the left-hand side can be written as a telescoping sum in which
each term is of the form ‖γ(N_t)‖² − ‖γ(N_{t+1})‖². By Corollary 5.18, each of these
terms is positive. Consider only those terms in which the difference δ_t − δ_{t+1} >
0. Their sum is minimized when all these differences are equal. Now using
Lemma 5.20 and the fact that δ_0 = δ and δ_k = 0, we get that

\[
\|\gamma(N_0)\|^2 - \|\gamma(N_k)\|^2 \ge \frac{\delta^2}{k}.
\]

By Lemma 5.14, k ≤ n, giving the desired inequality.
The above-stated inequality and the fact that ‖γ(N_0)‖² ≤ nδ² gives us

\[
\|\gamma(N_k)\|^2 \le \|\gamma(N_0)\|^2 \left( 1 - \frac{1}{n^2} \right).
\]

The lemma follows.
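
The step "their sum is minimized when all these differences are equal" is an instance of the Cauchy-Schwarz inequality. As a worked sketch, write d_t = δ_t − δ_{t+1} and let P be the set of indices t with d_t > 0 (the names d_t and P are introduced here only for this illustration):

```latex
\[
\|\gamma(N_0)\|^2 - \|\gamma(N_k)\|^2
  \;\ge\; \sum_{t \in P} d_t^2
  \;\ge\; \frac{1}{|P|} \Bigl( \sum_{t \in P} d_t \Bigr)^2
  \;\ge\; \frac{(\delta_0 - \delta_k)^2}{k}
  \;=\; \frac{\delta^2}{k}.
\]
% First inequality: Lemma 5.20 applied to the terms with t in P; the remaining
%   telescoping terms are nonnegative by Corollary 5.18.
% Second inequality: Cauchy-Schwarz.
% Third inequality: |P| <= k, and dropping the nonpositive d_t can only increase
%   the sum, so the sum over P is at least delta_0 - delta_k.
```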


Theorem 5.22 The algorithm finds equilibrium prices and allocations for linear
utility functions in Fisher's model using

\[
O\bigl(n^4 (\log n + n \log U + \log M)\bigr)
\]

max-flow computations.


PROOF By Lemma 5.21, the squared l_2 norm of the surplus vector drops by a factor
of half after O(n²) phases. At the start of the algorithm, the squared l_2 norm of the
surplus vector is at most M². Once its value drops below 1/Δ⁴, the algorithm achieves
equilibrium prices. This follows from Lemmas 5.12 and 5.13. Therefore the number of
phases is

\[
O\bigl(n^2 \log(\Delta^4 M^2)\bigr) = O\bigl(n^2 (\log n + n \log U + \log M)\bigr).
\]

By Lemma 5.14 each phase consists of n iterations and by Lemma 5.11 each
iteration requires n max-flow computations. The theorem follows.
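
The arithmetic behind the phase count can be spelled out as follows, using Δ = nU^n from the start of this section and the per-phase factor from Lemma 5.21:

```latex
\begin{align*}
\Bigl(1 - \frac{1}{n^2}\Bigr)^{n^2} &\le e^{-1} < \frac{1}{2}
  && \text{so } n^2 \text{ phases at least halve the squared norm,} \\
\log_2\!\frac{M^2}{1/\Delta^4} = \log_2 (\Delta^4 M^2)
  &= 4\log_2 n + 4n\log_2 U + 2\log_2 M
  && \text{halvings take it from } M^2 \text{ below } 1/\Delta^4, \\
n^2 \cdot O(\log n + n\log U + \log M) \cdot n \cdot n
  &= O\bigl(n^4(\log n + n\log U + \log M)\bigr)
  && \text{phases} \times \text{iterations} \times \text{max-flows.}
\end{align*}
```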


5.11 The Linear Case of the Arrow-Debreu Model


The Arrow-Debreu model is also known as the Walrasian model or the exchange
model, and it generalizes Fisher's model. Consider a market consisting of a set A of
agents and a set G of goods; assume |G| = n and |A| = m. Each agent i comes to the
market with an initial endowment of goods, e_i = (e_i1, e_i2, ..., e_in). We may assume
w.l.o.g. that the total amount of each good is unit, i.e., for 1 ≤ j ≤ n, Σ_{i=1}^{m} e_ij = 1.
Each agent has linear utilities for these goods. The utility of agent i on deriving x_ij
amount of good j, for 1 ≤ j ≤ n, is Σ_{j=1}^{n} u_ij x_ij.

The problem is to find prices p = (p_1, ..., p_n) for the goods so that if each agent
sells her initial endowment at these prices and buys her optimal bundle, the market
clears; i.e., there is no deficiency or surplus of any good. An agent may have more than
one optimal bundle; we will assume that we are free to give each agent any optimal
bundle to meet the market clearing condition.

Observe that a Fisher market with linear utilities, n goods, and m buyers reduces
to an Arrow-Debreu market with linear utilities, n + 1 goods and m + 1 agents as
follows. In the Arrow-Debreu market, we will assume that money is the (n + 1)st good,
the first m agents correspond to the m buyers whose initial endowment is the money
they come to the market with and the (m + 1)st agent's initial endowment is all n goods.
The first m agents have utilities for goods, as given by the Fisher market and no utility
for money, whereas the (m + 1)st agent has utility for money only.
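
The reduction amounts to appending one column (money) and one row (the seller) to the endowment and utility matrices. The sketch below builds both; the function name and list-of-lists layout are assumptions made for illustration, and the money endowments are rescaled to sum to one so that the unit-supply convention stated above also holds for the money good.

```python
from fractions import Fraction


def fisher_to_arrow_debreu(money, u):
    """Build (endowments, utilities) of the Arrow-Debreu market.

    money[i]: money of Fisher buyer i.
    u[i][j]:  utility of buyer i per unit of good j (each good has unit supply).
    Rows 0..m-1 are the buyers, row m is the extra agent who owns all goods.
    Columns 0..n-1 are the goods, column n is money.
    """
    m, n = len(money), len(u[0])
    total = sum(money)
    # Buyers own only money (normalized so that money is a unit-supply good).
    endow = [[0] * n + [Fraction(money[i]) / total] for i in range(m)]
    endow.append([1] * n + [0])          # agent m + 1 owns one unit of each good
    # Buyers keep their Fisher utilities and have no utility for money.
    util = [list(u[i]) + [0] for i in range(m)]
    util.append([0] * n + [1])           # agent m + 1 values money only
    return endow, util
```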

We define the following terms for the algorithm below. For agent i, let a_i = Σ_{j=1}^{n} e_ij.
Let a_min be the minimum among a_i, 1 ≤ i ≤ m. Denote by p_max the maximum price
assigned to a good by the algorithm. Denote by v_min and v_max the minimum and
maximum values of u_ij over all agents i and goods j.


5.12 An Auction-Based Algorithm 


We will present an auction-based algorithm for the linear case of the Arrow-Debreu
model. It will find an approximate equilibrium in the following sense. For any fixed
ε > 0, it will find prices p for the goods such that the market clears and each agent
gets a bundle of goods that provides her utility at least (1 − ε)² times the utility of her
optimal bundle.

The algorithm initializes the price of each good to be unit, computes the worth of 
the initial endowment of each agent, and gives this money to each agent. All goods are 
initially fully unsold. 

We will denote by p = (p_1, p_2, ..., p_n) the vector of prices of goods at any point in
the algorithm. As p changes, the algorithm recomputes the value of each agent's initial
endowment and updates her money accordingly. Clearly, at the start of the algorithm,
the total surplus (unspent) money of all agents is n.

At any point in the algorithm, a part of good j is sold at price p_j and part of it is
sold at (1 + ε)p_j. The run of the algorithm is partitioned into iterations. Each iteration
terminates when the price of some good is raised by a factor of (1 + ε). Each iteration
is further partitioned into rounds. In a round, the algorithm considers agents one by one
in some arbitrary but fixed order, say 1, 2, ..., m. If the agent being considered, i, has
no surplus money, the algorithm moves to the next agent. Otherwise, it finds i's optimal
good, in terms of bang per buck, at current prices; say, it is good j. It then proceeds
to execute the operation of outbid. This entails buying back good j from agents who
have it at price p_j and selling it to i at price p_j(1 + ε). This process can end in one of
two ways:

• Agent i's surplus money is exhausted. If so, the algorithm moves on to the next agent.

• No agent has good j at price p_j anymore. If so, it raises the price of good j to p_j(1 + ε)
by setting p_j to p_j(1 + ε). The current iteration terminates and agents' moneys are
updated because of this price rise.


When the current round comes to an end, the algorithm checks if the total surplus
money with the buyers is at most ε a_min. If so, the algorithm terminates. Otherwise, it
goes to the next round.

At termination, the algorithm gives the unsold goods to an arbitrary agent to ensure
that the market clears. It outputs the allocations received by all agents and the
terminating prices p. Observe, however, that some of good j may have been sold at
price (1 + ε)p_j even though the equilibrium price of good j is p_j. Because of this
discrepancy, agents will only get approximately optimal bundles. Lemma 5.25 will
establish a bound on the approximation factor.
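
The round and iteration structure described above can be sketched in a few dozen lines. The code below is an illustrative reading of the description, not a reference implementation, and it rests on assumptions that the text leaves open: unsold goods are modelled as held by a dummy holder at the current price p_j, an agent may outbid her own lower-priced holding (paying only the ε p_j markup), leftover goods go to agent 0, and floating-point amounts below `tol` are treated as zero. All names are hypothetical.

```python
def auction(e, u, eps, tol=1e-12):
    """Approximate equilibrium for a linear exchange market (sketch).

    e[i][j]: endowment of agent i in good j (each good sums to 1 over agents).
    u[i][j]: utility of agent i per unit of good j.
    Returns (prices, alloc), where alloc[i][j] is the amount of good j given to i.
    """
    m, n = len(e), len(e[0])
    p = [1.0] * n
    # lo[k][j]: amount of good j held by k at price p[j]; row m is a dummy
    # holder standing for the unsold part (an assumption of this sketch).
    lo = [[0.0] * n for _ in range(m)] + [[1.0] * n]
    # hi[i][j]: amount of good j held by agent i at price (1 + eps) * p[j].
    hi = [[0.0] * n for _ in range(m)]
    a_min = min(sum(row) for row in e)

    def surplus(i):
        worth = sum(e[i][j] * p[j] for j in range(n))
        spent = sum((lo[i][j] + (1 + eps) * hi[i][j]) * p[j] for j in range(n))
        return worth - spent

    def outbid(i):
        """Let agent i bid on her best good; return True if its price was raised."""
        j = max(range(n), key=lambda g: u[i][g] / p[g])  # best bang per buck
        for k in range(m, -1, -1):  # dummy holder first, then the agents
            if lo[k][j] <= tol:
                continue
            # Outbidding one's own holding costs only the eps * p[j] markup.
            unit_cost = (eps if k == i else 1 + eps) * p[j]
            take = min(lo[k][j], surplus(i) / unit_cost)
            lo[k][j] -= take
            hi[i][j] += take
            if surplus(i) <= tol:
                return False        # surplus exhausted: move to the next agent
        p[j] *= 1 + eps             # nobody holds good j at the old price
        for k in range(m):
            lo[k][j], hi[k][j] = hi[k][j], 0.0
        return True

    while True:                     # each pass of this loop is one round
        raised = False
        for i in range(m):
            if surplus(i) > tol and outbid(i):
                raised = True       # a price rise ends the current iteration
                break
        if not raised and sum(surplus(i) for i in range(m)) <= eps * a_min:
            break
    alloc = [[lo[i][j] + hi[i][j] for j in range(n)] for i in range(m)]
    for j in range(n):
        alloc[0][j] += lo[m][j]     # hand leftover goods to an arbitrary agent
    return p, alloc
```

In a two-agent market where each agent owns one good and wants only the other, a single round suffices: each agent spends her unit of money at price 1 + ε and receives 1/(1 + ε) of the good she wants.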


Lemma 5.23 The number of rounds executed in an iteration is bounded by

\[
O\left( \frac{1}{\epsilon} \log \frac{n\, p_{\max}}{\epsilon\, a_{\min}} \right).
\]


PROOF Observe that if outbid buys a good at price p_j, it sells it at price (1 +
ε)p_j, thereby decreasing the overall surplus. Therefore, in each round that is fully
completed (i.e., does not terminate mid-way because of a price increase), the
total surplus of agents is reduced by a factor of (1 + ε). The total surplus at the
beginning of the iteration is at most the total money possessed by all agents, i.e.,
n p_max. The iteration terminates (and in fact the algorithm terminates) as soon as
the total surplus is at most ε a_min. Therefore, a bound on the number of rounds in
an iteration is

\[
\log_{1+\epsilon} \frac{n\, p_{\max}}{\epsilon\, a_{\min}}.
\]
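
The passage from a logarithm to base 1 + ε to the O((1/ε) log ·) form used in the lemma statement is a standard change of base. A short sketch, assuming 0 < ε ≤ 1 (an assumption not stated explicitly in the text) and y ≥ 1:

```latex
\[
\log_{1+\epsilon} y \;=\; \frac{\ln y}{\ln(1+\epsilon)}
\;\le\; \frac{2}{\epsilon}\,\ln y
\;=\; O\!\left(\frac{1}{\epsilon}\log y\right),
\qquad\text{because } \ln(1+\epsilon) \ge \epsilon - \frac{\epsilon^2}{2} \ge \frac{\epsilon}{2}
\text{ for } 0 < \epsilon \le 1 .
\]
```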


Lemma 5.24 The total number of iterations is bounded by

\[
O\left( \frac{n}{\epsilon} \log p_{\max} \right).
\]


PROOF Each iteration raises the price of a good by a factor of (1 + ε). Therefore
the number of iterations is bounded by

\[
n \log_{1+\epsilon} p_{\max}.
\]


Lemma 5.25 Relative to terminating prices, each agent gets a bundle of goods
that provides her utility at least (1 − ε)² times the utility of her optimal bundle.


PROOF The algorithm always sells an agent her optimal goods relative to current
prices p (recall, however, that at the time of the sale, an agent is charged a price
of (1 + ε)p_j for good j). There are two reasons why an agent i may end up with a
suboptimal bundle in the end. First, at termination, part of her money may remain
unspent. Let M denote the total worth of i's initial endowment at terminating
prices. Assume that she spent M_1 of this. Since the total surplus money left at
termination is at most ε a_min, M_1 ≥ (1 − ε)M.

The second reason is that some part of good j may have been sold at price (1 +
ε)p_j to agent i, even though the equilibrium price announced is p_j. Equivalently,
we may assume that i gets her optimal goods at prices p for a fraction of her
money. The latter is at least

\[
\frac{M_1}{1+\epsilon} \ge \frac{(1-\epsilon)M}{1+\epsilon} > (1-\epsilon)^2 M
\]

money. The lemma follows.
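
The last inequality in this chain rests on a one-line fact, valid for 0 < ε < 1:

```latex
\[
(1-\epsilon)(1+\epsilon) = 1 - \epsilon^2 < 1
\;\;\Longrightarrow\;\;
\frac{1}{1+\epsilon} > 1-\epsilon
\;\;\Longrightarrow\;\;
\frac{(1-\epsilon)M}{1+\epsilon} > (1-\epsilon)^2 M .
\]
```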


Theorem 5.26 The algorithm given above finds an approximate equilibrium for
the linear case of the Arrow-Debreu model in time

\[
O\left( \frac{mn}{\epsilon^2} \log \frac{n\, v_{\max}}{\epsilon\, a_{\min}\, v_{\min}} \log \frac{v_{\max}}{v_{\min}} \right).
\]


PROOF Observe that each good whose price is raised beyond 1 is fully sold.
Since the total money of agents is the total worth of all goods at prices p, the
condition that the total surplus money of agents is at most ε a_min must be reached
before the price of all goods increases beyond 1. Hence at termination, the price
of at least one good is 1.

Clearly, at termination, the ratio of maximum to minimum price of a good is
bounded by v_max/v_min. Therefore, p_max is bounded by v_max/v_min. Each round is
executed in O(m) time. Now the bound on the total running time follows from
Lemmas 5.23 and 5.24.
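
For reference, the bound is the product of the three quantities named in the proof, with p_max replaced by its upper bound v_max/v_min:

```latex
\[
\underbrace{O(m)}_{\text{time per round}}
\cdot
\underbrace{O\!\left(\frac{1}{\epsilon}\log\frac{n\,p_{\max}}{\epsilon\,a_{\min}}\right)}_{\text{rounds per iteration (Lemma 5.23)}}
\cdot
\underbrace{O\!\left(\frac{n}{\epsilon}\log p_{\max}\right)}_{\text{iterations (Lemma 5.24)}}
=
O\!\left(\frac{mn}{\epsilon^{2}}
\log\frac{n\,v_{\max}}{\epsilon\,a_{\min}\,v_{\min}}
\log\frac{v_{\max}}{v_{\min}}\right).
\]
```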


5.13 Resource Allocation Markets 


Kelly considered the following general setup for modeling resource allocation. Let R
be a set of resources and c: R → Z⁺ be the function specifying the available capacity
of each resource r ∈ R. Let A = {a_1, ..., a_n} be a set of agents and m_i ∈ Z⁺ be the
money available with agent a_i.

Each agent wants to build as many objects as possible using resources in R. An
agent may be able to use several different subsets of R to make one object. Let
S_i1, S_i2, ..., S_ik_i be subsets of R usable by agent a_i, k_i ∈ Z⁺. Denote by x_ij the number
of objects a_i makes using the subset S_ij, 1 ≤ j ≤ k_i; x_ij is not required to be integral.
Let f_i = Σ_{j=1}^{k_i} x_ij be the total number of objects made by agent a_i. We will say that
f_i, 1 ≤ i ≤ n is feasible if simultaneously each agent a_i can make f_i objects without
violating capacity constraints on R.

Kelly gave the following convex program and showed that an optimal solution to it
satisfies proportional fairness; i.e., if f_i* is an optimal solution and f_i is any feasible
solution, then

\[
\sum_{i=1}^{n} \frac{f_i - f_i^*}{f_i^*} \le 0.
\]

Intuitively, the only way of making an agent happier by 5% is to make other agents
unhappy by at least a total of 5%.

\[
\begin{aligned}
\text{Maximize} \quad & \sum_{a_i \in A} m_i \log f_i \\
\text{Subject to} \quad & f_i = \sum_{j=1}^{k_i} x_{ij} && \forall a_i \in A \\
& \sum_{(i,j):\, r \in S_{ij}} x_{ij} \le c(r) && \forall r \in R \\
& x_{ij} \ge 0 && \forall a_i \in A,\ 1 \le j \le k_i
\end{aligned}
\tag{5.2}
\]

This general setup can be used to model many situations. The following are examples 
of situations of a combinatorial nature. 


(i) Market 1 (flow market): Given a directed or undirected graph G = (V, E), E is
the set of resources, with capacities specified. Agents are source-sink pairs of nodes,
(s_1, t_1), ..., (s_k, t_k), with money m_1, ..., m_k, respectively. Each s_i − t_i path is an
object for agent (s_i, t_i).

(ii) Market 2: Given a directed graph G = (V, E), E is the set of resources, with
capacities specified. Agents are A ⊆ V, each with specified money. For s ∈ A objects
are branchings rooted at s and spanning all V.

(iii) Market 3: Same as above, except the graph is undirected and the objects are spanning
trees.


Using KKT conditions, one can show that an optimal solution to this convex program
is an equilibrium solution. Let p_r, r ∈ R be Lagrangian variables corresponding to the
second set of conditions; we will interpret these as prices of resources. By the KKT
conditions optimal solutions to x_ij's and p_r's must satisfy the following equilibrium
conditions:


(i) Resource r ∈ R has positive price only if it is used to capacity.
(ii) Each agent uses only the cheapest sets to make objects.
(iii) The money of each agent is fully used up.


Since the objective function of convex program (5.2) is strictly concave, one can
see that at optimality, the vector f_1, ..., f_n is unique. Clearly, this also holds for every
equilibrium allocation.


5.14 Algorithm for Single-Source Multiple-Sink Markets 


In this section, we consider the special case of a flow market, Market 1, with a single 
source and multiple sinks. We will assume that the underlying graph is directed. In case 
it is undirected, one can use the standard reduction from undirected graphs to directed 
graphs — replace each undirected edge (u, v) with the two edges (u, v) and (v, u) of the 
same capacity. 

Formally, let G = (V, E) be a directed graph with capacities on edges. Let s ∈ V
be the source node and T = {t_1, ..., t_r} be the set of sink nodes, also called terminals.
Let m_i be the money possessed by sink t_i. The problem is to determine equilibrium
flow and edge prices. The following example may help appreciate better some of the
intricacies of this problem.


Example 5.27 Consider graph G = (V, E) with V = {s, a, b, c, d} and sinks
b and d with $120 and $10, respectively. The edges are (s, a), (s, c) having
capacity 2, (a, b) having capacity 1, and (a, d), (c, d), (c, b) having capacity
10 (see Figure 5.5). The unique equilibrium prices are p_(s,a) = $10, p_(a,b) =
$30, p_(s,c) = $40, and the rest of the edges have zero price. At equilibrium, flow
on path s, a, d is 1, on s, a, b is 1, and on s, c, b is 2. Simulating the algorithm
below on this example will reveal the complex sequence of cuts it needs to find
in order to compute the equilibrium. Computing equilibrium for other values of
money is left as an interesting exercise.
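
The numbers in Example 5.27 can be checked mechanically against the three equilibrium conditions of Section 5.13: priced edges are saturated, flow uses only cheapest paths, and each sink spends all its money. The sketch below hard-codes the example's data; the helper names are chosen here for illustration.

```python
# Data of Example 5.27: capacities, stated equilibrium prices, budgets, path flows.
CAP = {("s", "a"): 2, ("s", "c"): 2, ("a", "b"): 1,
       ("a", "d"): 10, ("c", "d"): 10, ("c", "b"): 10}
PRICE = {("s", "a"): 10, ("a", "b"): 30, ("s", "c"): 40}  # other edges cost 0
MONEY = {"b": 120, "d": 10}
PATH_FLOW = {("s", "a", "d"): 1, ("s", "a", "b"): 1, ("s", "c", "b"): 2}


def path_cost(path):
    """Sum of the edge prices along a path given as a tuple of vertices."""
    return sum(PRICE.get(edge, 0) for edge in zip(path, path[1:]))


def all_paths(src, dst):
    """All directed paths from src to dst (the example graph is acyclic)."""
    if src == dst:
        yield (dst,)
        return
    for (u, v) in CAP:
        if u == src:
            for rest in all_paths(v, dst):
                yield (src,) + rest


def check_equilibrium():
    """Return the truth values of the three equilibrium conditions."""
    load = {edge: 0 for edge in CAP}
    for path, flow in PATH_FLOW.items():
        for edge in zip(path, path[1:]):
            load[edge] += flow
    # (i) feasibility, and a positive price only on saturated edges
    saturated = all(load[e] <= CAP[e] and (PRICE.get(e, 0) == 0 or load[e] == CAP[e])
                    for e in CAP)
    # (ii) every path carrying flow is a cheapest path to its sink
    rate = {t: min(path_cost(q) for q in all_paths("s", t)) for t in MONEY}
    cheapest = all(path_cost(path) == rate[path[-1]] for path in PATH_FLOW)
    # (iii) each sink spends exactly its money
    spent = {t: sum(flow * path_cost(path)
                    for path, flow in PATH_FLOW.items() if path[-1] == t)
             for t in MONEY}
    return saturated, cheapest, spent == MONEY
```

Here rate(b) = 40 on both of its paths and rate(d) = 10, so b demands 120/40 = 3 units and d demands 10/10 = 1 unit, matching the stated path flows.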


We will present a strongly polynomial algorithm for this problem which is based
on the primal-dual schema; i.e., it alternately adjusts flows and prices, attempting to
satisfy all KKT conditions. Often, primal-dual algorithms can naturally be viewed as
executing an auction. This viewpoint leads to a particularly simple way of presenting
the current algorithm. We will describe it as an ascending price auction in which the
buyers are sinks and sellers are edges. The buyers have fixed budgets and are trying to
maximize the flow they receive and the sellers are trying to extract as high a price as
possible from the buyers. One important deviation from the usual auction situation is
that the sellers act in a highly coordinated manner: at any point in the algorithm, all
edges in a particular cut, say (S, S̄), raise their prices simultaneously while prices of
the remaining edges remain unchanged. The prices of all edges are initialized to zero.
The first cut considered by the algorithm is the (unique) maximal min-cut separating
all sinks from s, say (S_0, S̄_0).

Figure 5.5. The network for Example 5.27.

Denote by rate(t_i) the cost of the cheapest s − t_i path w.r.t. current prices. The flow
demanded by sink t_i at this point is m_i/rate(t_i). At the start of the algorithm, when all
edge prices are zero, each sink is demanding infinite flow. Therefore, the algorithm
will not be able to find a feasible flow that satisfies all demands. Indeed, this will be
the case all the way until termination; at any intermediate point, some cuts will need
to be oversaturated in order to meet all the demand.

The price of edges in cut (S, S̄) is raised as long as the demand across it exceeds
supply; i.e., the cut is oversaturated because of flow demanded by sinks in S̄. At the
moment that demand exactly equals supply, the edges in this cut stop raising prices and
declare themselves sold at current prices. This makes sense from the viewpoint of the
edges in the cut: if they raise prices any more, demand will be less than supply; i.e.,
the cut will be under-saturated, and then these edges will have to be priced at zero!

The crucial question is: when does the cut (S, S̄) realize that it needs to sell itself?
This point is reached as soon as there is a cut, say (U, Ū), with S ⊂ U, such that the
difference in the capacities of the two cuts is precisely equal to the flow demanded by
sinks in U − S (see Figure 5.6). Let (U, Ū) be the maximal such cut (it is easy to see
that it will be unique). If U = V, the algorithm halts. Otherwise, cut (U, Ū) must be
oversaturated; it assumes the role of (S, S̄) and the algorithm goes to the next iteration.

Note that an edge may be present in more than one cut whose price is raised by the
algorithm. If so, its price will be simply the sum of the prices assigned to these cuts.

Suppose that the algorithm executes k iterations. Let (S_i, S̄_i) be the cut it finds in
iteration i, 1 ≤ i ≤ k, with S_k = V. Clearly, we have S_0 ⊂ S_1 ⊂ ··· ⊂ S_k = V. Let T_i
be the set of terminals in S_i − S_{i−1}, for 1 ≤ i ≤ k. Let c_i be the set of edges of G in
the cut (S_i, S̄_i), for 0 ≤ i < k and p_i be the price assigned to edges in c_i. Clearly, for
each terminal t ∈ T_i, rate(t) = p_0 + ··· + p_{i−1}, for 1 ≤ i ≤ k.


Figure 5.6. The total flow demanded by the sinks in U − S equals the difference in
capacities of cut (S, S̄) and cut (U, Ū).

Let G' denote the graph obtained by adding a new sink node t to G and edges (t_i, t)
from each of the original sinks to t. Let the capacity of edge (t_i, t) be m_i/rate(t_i). For
convenience, even in G', we will denote V − S by S̄. It is easy to see that each of the
cuts (S_i, S̄_i ∪ {t}) in G' has the same capacity, for 0 ≤ i ≤ k, and each of these k + 1
cuts is a minimum s − t cut in G'.

Let f' denote a maximum s − t flow in G'. Obtain flow f from f' by ignoring flow
on the edges into t. Then f is a feasible flow in G that sends m_i/rate(t_i) flow to each
sink t_i.


Lemma 5.28 Flow f and the prices found by the algorithm constitute an
equilibrium flow and prices.


PROOF We will show that flow f and the prices found satisfy all KKT conditions.

• Since each of the cuts (S_i, S̄_i ∪ {t}), for 0 ≤ i ≤ k is saturated in G' by flow f',
each of the cuts c_0, c_1, ..., c_{k−1} is saturated by f. Hence, all edges having nonzero
prices must be saturated.

• The cost of the cheapest path to terminal t' ∈ T is rate(t'). Clearly, every flow to t'
uses a path of this cost.

• Since the flow sent to each terminal t_i ∈ T is m_i/rate(t_i), the money of each
terminal is fully spent.


Below we give a strongly polynomial time subroutine for computing the next cut in
each iteration.


5.14.1 Finding the Next Cut 


Let (S, S̄) be the cut in G, whose price is being raised in the current iteration and let c
be the set of edges in this cut and f its capacity. Let T' denote the set of sinks in S̄. Let
p' denote the sum of the prices assigned to all cuts found so far in the algorithm (this
is a constant for the purposes of this subroutine) and let p denote the price assigned to
edges in c. The cut (S, S̄) satisfies the following conditions:


• It is a maximal minimum cut separating T' from s.
• At p = 0, every cut (U, Ū), with S ⊆ U, is oversaturated.


Let p* be the smallest value of p at which there is a cut (U, Ū), with S ⊂ U, in G
such that the difference in the capacities of (S, S̄) and (U, Ū) is precisely equal to the
flow demanded by sinks in U − S at prices p*; moreover, (U, Ū) is the maximal such
cut. Below we give a strongly polynomial algorithm for finding p* and (U, Ū).

Define graph G' by adding a new sink node t to G and edges (t_i, t) for each sink
t_i ∈ S̄. Define the capacity of edge (t_i, t) to be m_i/(p' + p) where m_i is the money of
sink t_i (see Figure 5.7). As in Section 5.14 we will denote V − S by S̄ even in G'. The
proof of the following lemma is obvious.

Figure 5.7. Graph G'. The cut (S, S̄) is priced at p; the cut (U, Ū) is the next cut.


Lemma 5.29 At the start of the current iteration, (S, S̄ ∪ {t}) is a maximal
minimum s − t cut in G'. p* is the smallest value of p at which a new minimum
s − t cut appears in G'. (U, Ū ∪ {t}) is the maximal minimum s − t cut in G' at
price p*.


For any cut C in G', let cap_p(C) denote its capacity, assuming that the price of edges
in c is p. For p ≥ 0, define cut(p) to be the maximal s − t min-cut in G' assuming
that the price assigned to edges in c is p. For cut (A, Ā ∪ {t}), A ⊆ V, let price(A, Ā ∪
{t}) denote the smallest price that needs to be assigned to edges in c to ensure that
cap_p(A, Ā ∪ {t}) = f; i.e., (A, Ā ∪ {t}) is also a min s − t cut in G'; if (A, Ā ∪ {t})
cannot be made a minimum s − t cut for any price p then price(A, Ā ∪ {t}) = ∞.
Clearly, price(A, Ā ∪ {t}) ≥ p*. Observe that determining price(A, Ā ∪ {t}) involves
simply solving an equation in which p is unknown.
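
The equation in question can be made explicit. As a sketch, for a cut with S ⊆ A write c_A for the total capacity of the edges of G leaving A and m_A for the total money of the sinks in A − S (this decomposition and the two names are introduced here for illustration). The edges (t_i, t) crossing the cut are exactly those of the sinks in A − S, each of capacity m_i/(p' + p), so:

```latex
\[
\mathrm{cap}_p(A, \bar{A} \cup \{t\}) \;=\; c_A + \frac{m_A}{p' + p} \;=\; f
\quad\Longrightarrow\quad
\mathrm{price}(A, \bar{A} \cup \{t\}) \;=\; \frac{m_A}{f - c_A} - p'
\qquad (\text{finite only if } c_A < f).
\]
% Example 5.27, first iteration: S = {s}, f = 4, p' = 0. For A = {s, a, d} the
% edges of G leaving A are (s,c) and (a,b), so c_A = 3, and m_A = 10 (sink d).
% Hence price = 10 / (4 - 3) - 0 = 10, which agrees with the stated price of (s,a).
```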


Lemma 5.30 Suppose p > p*. Let cut(p) = (A, Ā ∪ {t}), where A ≠ U. Let
price(A, Ā ∪ {t}) = q and cut(q) = (B, B̄ ∪ {t}). Then B ⊂ A.


PROOF Since we have assumed that A ≠ U, it must be the case that
cap_p(A, Ā ∪ {t}) < f. Therefore, q = price(A, Ā ∪ {t}) < p. Let c_A and c_B denote
the capacities of (A, Ā ∪ {t}) and (B, B̄ ∪ {t}) at price p = 0. Let m_A and
m_B denote the money possessed by sinks in (A − S) and (B − S), respectively.
Since (A, Ā ∪ {t}) is a maximal s − t mincut at price p,

\[
c_A + \frac{m_A}{p} < c_B + \frac{m_B}{p}.
\]

Since (B, B̄ ∪ {t}) is a maximal s − t mincut at price q,

\[
c_B + \frac{m_B}{q} < c_A + \frac{m_A}{q}.
\]

The two together imply

\[
\frac{m_B - m_A}{q} < c_A - c_B < \frac{m_B - m_A}{p}.
\]

First suppose that A ⊂ B. Clearly m_A ≤ m_B. But this contradicts the last
inequality since q < p.

Next, suppose that A and B cross. By the last inequality above, there must be a
price, r, such that q < r < p at which cap_r(A, Ā ∪ {t}) = cap_r(B, B̄ ∪ {t}) = g,
say. By the submodularity of cuts, one of the following must hold:

(i) cap_r(A ∩ B, (V − (A ∩ B)) ∪ {t}) ≤ g. Since the money possessed by sinks in
(A ∩ B) − S is at most m_B, at price q, cap_q(A ∩ B, (V − (A ∩ B)) ∪ {t}) <
cap_q(B, B̄ ∪ {t}). This contradicts the fact that (B, B̄ ∪ {t}) is a min-cut at price q.

(ii) cap_r(A ∪ B, (V − (A ∪ B)) ∪ {t}) ≤ g. Since the money possessed by sinks in
(A ∪ B) − S is at least m_A, at price p, cap_p(A ∪ B, (V − (A ∪ B)) ∪ {t}) <
cap_p(A, Ā ∪ {t}). This contradicts the fact that (A, Ā ∪ {t}) is a min-cut at price p.


Hence we get that B ⊂ A.


Subroutine
Inputs: Cut (S, S̄) in G whose price is being raised in the current iteration.
Output: Price p* and next cut (U, Ū).

(i) C ← (V, t)
(ii) p ← price(C)
(iii) While cut(p) ≠ C do:
    (a) C ← cut(p)
    (b) p ← price(C)
(iv) Output (C, p)

Figure 5.8. Subroutine for finding next cut.


Lemma 5.31 Subroutine 5.8 terminates with the cut (U, Ū ∪ {t}) and price p*
in at most r max-flow computations, where r is the number of sinks.


PROOF As long as p > p*, by Lemma 5.30, the algorithm keeps finding smaller
and smaller cuts, containing fewer sinks on the s side. Therefore, in at most r
iterations, it must arrive at a cut such that p = p*. Since cut(p*) = (U, Ū ∪ {t}),
the next cut it considers is (U, Ū ∪ {t}). Since price(U, Ū ∪ {t}) = p*, at this
point the algorithm terminates.


Theorem 5.32 The algorithm given in Section 5.14 finds equilibrium edge
prices and flows using O(r²) max-flow computations, where r is the number of
sinks.


PROOF Clearly, the number of sinks trapped in the sets S_0 ⊂ S_1 ⊂ ··· ⊂ S_k
keeps increasing and therefore, the number of iterations k ≤ r. The running time
for each iteration is dominated by the time taken by subroutine (5.8), which
by Lemma 5.31 is r max-flow computations. Hence the total time taken by the
algorithm is O(r²) max-flow computations. By Lemma 5.28 the flow and prices
found by the algorithm are equilibrium flow and prices.
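
To see the cut sequence on Example 5.27, the whole algorithm can be simulated on that five-vertex graph. The sketch below follows the subroutine of Figure 5.8 but, as a deliberate simplification, replaces every max-flow computation by brute-force enumeration of vertex sets, which is only viable for tiny graphs; it also assumes that price(C) is finite for every cut the loop visits, which holds on this example. All function names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Data of Example 5.27: edge capacities and sink budgets.
CAP = {("s", "a"): 2, ("s", "c"): 2, ("a", "b"): 1,
       ("a", "d"): 10, ("c", "d"): 10, ("c", "b"): 10}
MONEY = {"b": 120, "d": 10}
NODES = frozenset("sabcd")


def graph_cap(side):
    """Total capacity of the edges of G that leave the vertex set `side`."""
    return sum(c for (u, v), c in CAP.items() if u in side and v not in side)


def supersets(base):
    """Every vertex set that contains `base`."""
    extra = sorted(NODES - base)
    for k in range(len(extra) + 1):
        for combo in combinations(extra, k):
            yield base | frozenset(combo)


def largest_minimizer(candidates, value):
    """The largest set among those minimizing `value` (the maximal min-cut)."""
    candidates = list(candidates)
    best = min(value(A) for A in candidates)
    return max((A for A in candidates if value(A) == best), key=len)


def next_cut(S, p_prime):
    """Figure 5.8 for the cut with source side S; returns (p_star, U)."""
    f = graph_cap(S)

    def money(A):  # money of the sinks in A - S
        return sum(m for t, m in MONEY.items() if t in A and t not in S)

    def price(A):  # solve graph_cap(A) + money(A) / (p_prime + p) = f for p
        return Fraction(money(A), f - graph_cap(A)) - p_prime

    def cut(p):  # maximal min s-t cut of G' at price p, by enumeration
        return largest_minimizer(
            supersets(S), lambda A: graph_cap(A) + money(A) / (p_prime + p))

    C = NODES
    p = price(C)
    while cut(p) != C:
        C = cut(p)
        p = price(C)
    return p, C


def equilibrium_prices():
    """Run the ascending-price algorithm and return the price of every edge."""
    sinks = frozenset(MONEY)
    # First cut: maximal min-cut separating all sinks from s.
    S = largest_minimizer(
        (A for A in supersets(frozenset("s")) if not A & sinks), graph_cap)
    prices = {e: Fraction(0) for e in CAP}
    p_prime = Fraction(0)
    while S != NODES:
        p, U = next_cut(S, p_prime)
        for (u, v) in CAP:
            if u in S and v not in S:  # edges of the cut whose price was raised
                prices[(u, v)] += p
        p_prime += p
        S = U
    return prices
```

On this instance the simulation visits two cuts: source side {s}, whose edges are priced at 10, and then {s, a, d}, whose edges are priced at 30. Edge (s, c) lies in both cuts, which is how it ends up at 40.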


5.15 Discussion and Open Problems 


Linear utility functions provided us with perhaps the easiest algorithmic questions that
helped us commence our algorithmic study of market equilibria. However, such functions
are much too restrictive to be useful. Concave utility functions are considered
especially useful in economics because they model the important condition of decreasing
marginal utilities as a function of the amount of good obtained. Furthermore, if
the utility functions are strictly concave, at any given prices, there is a unique optimal
bundle of goods for each agent. This leads to the following remarkable communication
complexity fact: In such a market, it suffices to simply announce equilibrium prices;
then, all agents can individually compute and buy their optimal bundles and the market
clears!

On the other hand, concave utility functions, even if they are additively separable
over the goods, are not easy to deal with algorithmically. In fact, obtaining a polynomial
time algorithm for such functions is a premier open problem today. For the case of
linear functions, the approach used in Section 5.8 (starting with very low prices and
gradually raising them until the equilibrium is reached) is made possible by the property
of weak gross substitutability. This property holds for a utility function if on raising
the price of one good, the demand of another good cannot go down. As a consequence
of this property, the need to decrease the price of the second good does not arise.

Concave utility functions do not satisfy weak gross substitutability. Exercises 5.5 
and 5.6 outline an approach that attempts to finesse this difficulty for the case of 
piecewise-linear, concave functions. Does this approach lead to an efficient algorithm 
for computing, either exactly or approximately, equilibrium prices for such functions? 
If so, one can handle a concave function by approximating it with a piecewise-linear, 
concave function. Alternatively, can one show that finding an equilibrium for such 
utility functions is PPAD-hard? 

Considering the properties of the linear case of Fisher’s model established in 
Theorem 5.1, one wonders whether its equilibrium allocations can be captured via 
a linear program. Resolving this, positively or negatively, seems an exciting problem. 
Another question remaining open is whether there is a strongly polynomial algorithm 
for computing equilibrium prices for this case. Finally, we would like to point to the 
numerous questions remaining open for gaining a deeper algorithmic understanding of 
Eisenberg-Gale markets (Jain and Vazirani, 2006).


Exercises


5.1 Give a strongly polynomial algorithm for Fisher's linear case under the assumption
that all $u_{ij}$'s are 0/1 (the algorithm given in Section 5.8 is not strongly polynomial).


5.2 Let us extend Fisher's linear model to assume that buyers have utility for money
(Vazirani, 2006). Let $u_{i0}$ denote the utility accrued by buyer i for one unit of money.
Now, each buyer's optimal bundle can also include money; effectively this is part
of their own money which they prefer not to spend at current prices. The notion of
equilibrium also generalizes: all goods need to be sold and all money needs to be
either spent or returned as part of optimal bundles. Extend the algorithm given in
Section 5.8 to this situation, still maintaining its polynomial running time.


5.3 Let us define a new class of utility functions, spending constraint utility functions
for Fisher's model (Vazirani, 2006). As before, let A and B be the set of goods and
buyers, respectively. For $i \in B$ and $j \in A$, let $r^i_j : [0, e(i)] \to \mathbb{R}_+$ be the rate function
of buyer i for good j; it specifies the rate at which i derives utility per unit of j
received, as a function of the amount of her budget spent on j. If the price of j is
fixed at $p_j$ per unit amount of j, then the function $r^i_j / p_j$ gives the rate at which i
derives utility per dollar spent, as a function of the amount of her budget spent on
j.
Relative to prices p for the goods, give efficient algorithms for

(a) computing buyer i's optimal bundle,
(b) determining if p are equilibrium prices, and
(c) computing equilibrium allocations if p are equilibrium prices.


5.4 Prove that equilibrium prices are unique for the model of Exercise 5.3.


5.5 It turns out that there is a polynomial time algorithm for computing equilibrium
prices and allocations for the utility functions defined in Exercise 5.3 (Devanur and
Vazirani, 2004; Vazirani, 2006). The following is an attempt to use this algorithm
to derive an algorithm for computing equilibrium prices for the case of
piecewise-linear, concave utility functions for Fisher's model.

Let $f_{ij}$ be the piecewise-linear, concave utility function of buyer i for good j; $f_{ij}$
is a function of $x_{ij}$, the allocation of good j to buyer i. Let p be any prices of goods
that sum up to the total money possessed by buyers (as before, we will assume that
there is a unit amount of each good in the market).

Let us obtain spending constraint utility functions from the $f_{ij}$'s as follows. Let
$g_{ij}$ be the derivative of $f_{ij}$; clearly, $g_{ij}$ is a decreasing step function. Define

\[ h_{ij}(y_{ij}) = g_{ij}\left( \frac{y_{ij}}{p_j} \right), \]

where $y_{ij}$ denotes the amount of money spent by i on good j. Observe that function
$h_{ij}$ gives the rate at which i derives utility per unit of j received as a function of the
amount of money spent on j. Hence $h_{ij}$ is precisely a spending constraint utility
function. Let us run the algorithm mentioned above on these functions $h_{ij}$'s to obtain
equilibrium prices, say p'.

Show that p = p' iff prices p are equilibrium prices for the piecewise-linear,
concave utility functions $f_{ij}$'s (equilibrium prices for piecewise-linear, concave utility
functions need not be unique).


5.6 Open problem (Devanur and Vazirani, 2004): Consider the process given in Exercise
5.5, which, given starting prices p, finds new prices p'. By the assertion made in
Exercise 5.5, the fixed points of this process are precisely equilibrium prices for the
piecewise-linear, concave utility functions $f_{ij}$'s.

Does this procedure converge to a fixed point, and if so, how fast? If it does
not converge fast enough, does it converge quickly to an approximate fixed point,
which may be used to obtain approximate equilibrium prices?


5.7 Consider the single-source multiple-sink market for which a strongly polynomial
algorithm is given in Section 5.14. Obtain simpler algorithms for the case that the
underlying graph is a path or a tree.


5.8 Observe that the algorithm given in Section 5.14 for Market 1 defined in Section
5.13 uses the max-flow min-cut theorem critically (Jain and Vazirani, 2006). Obtain
a strongly polynomial algorithm for Market 3 using the following max-min theorem.

For a partition $V_1, \ldots, V_k$, $k \ge 2$, of the vertices of an undirected graph G, let C
be the capacity of edges whose end points are in different parts. Let us define the
edge-tenacity of this partition to be $C/(k - 1)$, and let us define the edge-tenacity
of G to be the minimum edge-tenacity over all partitions. Nash-Williams (1961) and
Tutte (1961) proved that the maximum fractional packing of spanning trees in G is
exactly equal to its edge-tenacity.


5.9 Next consider Market 2 defined in Section 5.13. For the case |A| = 1, a polynomial
time algorithm follows from the following max-min theorem due to Edmonds (1967).

Let G = (V, E) be a directed graph with edge capacities specified and source
$s \in V$. The maximum number of branchings rooted out of s that can be packed in
G equals $\min_{v \in V} c(v)$, where c(v) is the capacity of a minimum s-v cut.

Next assume that there are two agents, $s_1, s_2 \in V$. Derive a strongly polynomial
algorithm for this market using the following fact from Jain and Vazirani (2006). Let
$F_1$ and $F_2$ be capacities of a minimum $s_1$-$s_2$ and $s_2$-$s_1$ cut, respectively. Let F be
$\min_{v \in V - \{s_1, s_2\}} f'(v)$, where f'(v) is the capacity of a minimum cut separating v from
$s_1$ and $s_2$. Then:

(a) The maximum number of branchings, rooted at $s_1$ and $s_2$, that can be packed in
G is exactly $\min\{F_1 + F_2, F\}$.

(b) Let $f_1$ and $f_2$ be two nonnegative real numbers such that $f_1 \le F_1$, $f_2 \le F_2$, and
$f_1 + f_2 \le F$. Then there exists a packing of branchings in G with $f_1$ of them
rooted at $s_1$ and $f_2$ of them rooted at $s_2$.


CHAPTER 6 


Computation of Market 
Equilibria by Convex 
Programming 


Bruno Codenotti and Kasturi Varadarajan 


Abstract 


We introduce convex programming techniques to compute market equilibria in general equilibrium 
models. We show that this approach provides an effective arsenal of tools for several restricted, yet 
important, classes of markets. We also point out its intrinsic limitations. 


6.1 Introduction 


The market equilibrium problem consists of finding a set of prices and allocations of 
goods to economic agents such that each agent maximizes her utility, subject to her 
budget constraints, and the market clears. Since the nineteenth century, economists 
have introduced models that capture the notion of market equilibrium. In 1874, Walras
published the “Elements of Pure Economics,” in which he describes a model for the state
of an economic system in terms of demand and supply, and expresses the
supply-equals-demand equilibrium conditions (Walras, 1954). In 1936, Wald gave the first proof of the
existence of an equilibrium for the Walrasian system, albeit under severe restrictions 
(Wald, 1951). In 1954, Nobel laureates Arrow and Debreu proved the existence of an 
equilibrium under much milder assumptions (Arrow and Debreu, 1954). 

The market equilibrium problem can be stated as a fixed point problem, and indeed 
the proofs of existence of a market equilibrium are based on either Brouwer’s or Kaku- 
tani’s fixed point theorem, depending on the setting (see, e.g., the beautiful monograph 
(Border, 1985) for a friendly exposition of the main results in this vein). 

Under a capitalistic economic system, the prices and production of all goods are 
interrelated, so that the equilibrium price of one good may depend on all the different 
markets of goods that are available. Equilibrium models must therefore take into 
account a multitude of different markets of goods. This intrinsic large-scale nature of the 
problem calls for algorithmic investigations and shows the central role of computation. 

Starting from the 1960s, the intimate connection between the notions of fixed point and
market equilibrium was exploited for computational goals by Scarf and some coauthors,
who employed path-following techniques to compute approximate equilibrium prices
(Eaves and Scarf, 1976; Hansen and Scarf, 1973; Scarf, 1967, 1982). In their simplest
form these methods are based upon a decomposition of the price simplex into a large
number of small regions and on the use of information about the problem instance
to construct a path that can be shown to terminate close to a fixed point. While the
appropriate termination is guaranteed by the fixed point theorems, the worst case running
time of these algorithms turns out to be exponential.

Over the last few years, the problem of computing market equilibria has re- 
ceived significant attention within the theoretical computer science community. In- 
spired by Papadimitriou (2001), and starting with the work of Deng, Papadim- 
itriou, and Safra (2003), theoretical computer scientists have developed polyno- 
mial time algorithms for several restricted versions of the market equilibrium 
problem. 

In this chapter we focus on algorithms based on convex programming techniques. 
Elsewhere in this book (Vazirani, 2007), algorithms of a combinatorial nature are 
presented. 


6.1.1 Definitions: Models and Equilibrium 


We start by describing a model of the so-called exchange economy, an important special 
case of the model considered by Arrow and Debreu (1954). The more general one, 
which we will call the Arrow-Debreu model, includes the production of goods. We will 
deal with models with production in Section 6.6. 

Let us consider m economic agents that represent traders of n goods. Let $\mathbb{R}^n_+$ denote
the subset of $\mathbb{R}^n$ with all nonnegative coordinates. The j-th coordinate in $\mathbb{R}^n$ will
stand for good j. Each trader i has a concave utility function $u_i : \mathbb{R}^n_+ \to \mathbb{R}_+$, which
represents her preferences for the different bundles of goods, and an initial endowment
of goods $w_i = (w_{i1}, \ldots, w_{in}) \in \mathbb{R}^n_+$. We make the standard assumption that $u_i$ is
nonsatiable, that is, for any $x \in \mathbb{R}^n_+$, there is a $y \in \mathbb{R}^n_+$ such that $u_i(y) > u_i(x)$. We also
assume that $u_i$ is monotone, that is, $u_i(y) \ge u_i(x)$ if $y \ge x$. For the initial endowment
of trader i, we assume that $w_{ij} > 0$ for at least one j. At given prices $\pi \in \mathbb{R}^n_+$, trader
i will sell her endowment, and ask for the bundle of goods $x_i = (x_{i1}, \ldots, x_{in}) \in \mathbb{R}^n_+$
which maximizes $u_i(x)$ subject to the budget constraint(1) $\pi \cdot x \le \pi \cdot w_i$. The budget
constraint simply says that the bundles of goods that are available to trader i are the
ones that cost no more than her income $\pi \cdot w_i$.

An equilibrium is a vector of prices $\pi = (\pi_1, \ldots, \pi_n) \in \mathbb{R}^n_+$ at which, for each
trader i, there is a bundle $\bar{x}_i = (\bar{x}_{i1}, \ldots, \bar{x}_{in}) \in \mathbb{R}^n_+$ of goods such that the following
two conditions hold:


(i) For each trader i, the vector $\bar{x}_i$ maximizes $u_i(x)$ subject to the constraints $\pi \cdot x \le \pi \cdot w_i$
and $x \in \mathbb{R}^n_+$.
(ii) For each good j, $\sum_i \bar{x}_{ij} \le \sum_i w_{ij}$.


(1) Given two vectors x and y, $x \cdot y$ denotes their inner product.


Let $\mathbb{R}^n_{++}$ be the set of vectors in $\mathbb{R}^n$ whose components are strictly positive. For
purposes of exposition, we will generally restrict our attention to price vectors in $\mathbb{R}^n_{++}$.
When we violate this convention, we will be explicit about it.

For any price vector $\pi$, a vector $x_i(\pi)$, which maximizes $u_i(x)$ subject to the budget
constraint $\pi \cdot x \le \pi \cdot w_i$ and $x \in \mathbb{R}^n_+$, is called a demand of trader i at prices $\pi$.
Observe that there is at least one demand vector, and that there can be multiple demand
vectors. We will usually assume that there is exactly one demand vector at price $\pi$;
that is, we have a demand function. This assumption holds if the utility function
satisfies a condition known as strict quasi-concavity. Once again, we will be explicit
when we deal with exceptions, since for some common utility functions such as
the linear ones, the demand is not a function but a correspondence or a set-valued
function.

The vector $z_i(\pi) = x_i(\pi) - w_i$ is called the individual excess demand of trader
i. Then $X_k(\pi) = \sum_i x_{ik}(\pi)$ denotes the market demand of good k at prices $\pi$, and
$Z_k(\pi) = X_k(\pi) - \sum_i w_{ik}$ the market excess demand of good k at prices $\pi$. The
vectors $X(\pi) = (X_1(\pi), \ldots, X_n(\pi))$ and $Z(\pi) = (Z_1(\pi), \ldots, Z_n(\pi))$ are called market
demand (or aggregate demand) and market excess demand, respectively. Observe that
the economy satisfies positive homogeneity, i.e., for any price vector $\pi$ and any $\lambda > 0$,
we have $Z(\pi) = Z(\lambda\pi)$. The assumptions on the utility functions imply that for any
price $\pi$, we have $\pi \cdot x_i(\pi) = \pi \cdot w_i$. Thus the economy satisfies Walras' Law: for any
price $\pi$, we have $\pi \cdot Z(\pi) = 0$.

In terms of the aggregate excess demand function, the equilibrium can be equivalently
defined as a vector of prices $\pi = (\pi_1, \ldots, \pi_n) \in \mathbb{R}^n_+$ such that $Z_j(\pi) \le 0$ for
each j.
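
Positive homogeneity and Walras' Law can be checked numerically on a small instance. The sketch below assumes Cobb-Douglas traders, for whom demand has the closed form "spend a fixed fraction of income on each good"; the function name `excess_demand` and all numbers are illustrative assumptions, not part of the text.

```python
def excess_demand(prices, shares, endowments):
    """Market excess demand Z(pi) when every trader has a Cobb-Douglas utility.

    Trader i spends the fraction shares[i][j] of her income on good j, so her
    demand for good j is shares[i][j] * income / prices[j].
    """
    n = len(prices)
    z = [0.0] * n
    for a_i, w_i in zip(shares, endowments):
        income = sum(p * w for p, w in zip(prices, w_i))  # pi . w_i
        for j in range(n):
            z[j] += a_i[j] * income / prices[j] - w_i[j]
    return z


# Hypothetical two-trader, two-good economy (illustrative numbers only).
shares = [[0.5, 0.5], [0.2, 0.8]]
endowments = [[1.0, 0.0], [0.0, 1.0]]
pi = [1.0, 3.0]

z = excess_demand(pi, shares, endowments)
z_scaled = excess_demand([7.0 * p for p in pi], shares, endowments)  # Z(lambda * pi)
walras_value = sum(p * zk for p, zk in zip(pi, z))                   # pi . Z(pi)
```

Here `z` and `z_scaled` should agree up to rounding (homogeneity), and `walras_value` should vanish up to rounding even though `pi` is not an equilibrium.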


6.1.2 The Tatonnement Process 


The model of an economy and the definition of the market equilibrium fail to predict 
any kind of dynamics leading to an equilibrium, although they convey the intuition that, 
in any process leading to a stable state where demand equals supply, a disequilibrium 
price of a good will have to increase if the demand for such a good exceeds its supply, 
and vice versa. 

Walras (1954) introduced a price-adjustment mechanism, which he called tâtonnement.
He took inspiration from the workings of the stock-exchange in Paris, and
suggested a trial-and-error process run by a fictitious auctioneer. The economic agents 
receive a price signal, and report their demands at these prices to the auctioneer. The 
auctioneer then adjusts the prices in proportion to the magnitude of the aggregate de- 
mands, and announces the new prices. In each round, agents recalculate their demands 
upon receiving the newly adjusted price signal and report these new demands to the 
auctioneer. The process continues until prices converge to an equilibrium. In its contin- 
uous version, as formalized by Samuelson (1947), the tatonnement process is governed 
by the differential equation system: 


\[ \frac{d\pi_k}{dt} = G_k(Z_k(\pi)), \qquad k = 1, 2, \ldots, n, \tag{6.1} \]

where $G_k(\cdot)$ denotes some continuous and differentiable, sign-preserving function, and
$Z_k(\cdot)$ is the market excess demand function for good k.
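
To make (6.1) concrete, the following sketch integrates it with plain Euler steps, taking each $G_k$ to be the identity. The Cobb-Douglas economy, the step size, and the iteration count are assumptions chosen for illustration; for this particular instance the equilibrium price ratio works out to 2.5.

```python
def excess_demand(prices, shares, endowments):
    """Market excess demand for Cobb-Douglas traders (illustrative assumption)."""
    n = len(prices)
    z = [0.0] * n
    for a_i, w_i in zip(shares, endowments):
        income = sum(p * w for p, w in zip(prices, w_i))
        for j in range(n):
            z[j] += a_i[j] * income / prices[j] - w_i[j]
    return z


def tatonnement(prices, shares, endowments, step=0.01, iterations=5000):
    """Euler discretization of (6.1) with G_k the identity: pi <- pi + step * Z(pi)."""
    pi = list(prices)
    for _ in range(iterations):
        z = excess_demand(pi, shares, endowments)
        pi = [p + step * zk for p, zk in zip(pi, z)]
    return pi


# Hypothetical economy: each trader owns one unit of a different good.
shares = [[0.5, 0.5], [0.2, 0.8]]
endowments = [[1.0, 0.0], [0.0, 1.0]]

pi_final = tatonnement([1.0, 1.0], shares, endowments)
ratio = pi_final[1] / pi_final[0]
residual = max(abs(zk) for zk in excess_demand(pi_final, shares, endowments))
```

Only relative prices are determined, so the sketch reports the ratio of the two prices rather than their levels. This naive discretization carries no guarantee in general; it is not the algorithm analyzed in Section 6.3.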


6.1.3 Approximate Equilibria 


Since a price equilibrium vector that is rational exists only in very special cases, most 
algorithms actually compute an approximate equilibrium. 


Definition 6.1 A bundle $x_i \in \mathbb{R}^n_+$ is a $\mu$-approximate demand, for $\mu \ge 1$,
of trader i at prices $\pi$ if $u_i(x_i) \ge \frac{1}{\mu} u^*$ and $\pi \cdot x_i \le \mu \, \pi \cdot w_i$, where $u^* =
\max\{u_i(x) \mid x \in \mathbb{R}^n_+, \ \pi \cdot x \le \pi \cdot w_i\}$.


A price vector $\pi$ is a strong $\mu$-approximate equilibrium ($\mu \ge 1$) if there are bundles
$x_i$ such that (1) for each trader i, $x_i$ is the demand of trader i at prices $\pi$, and (2) $\sum_i x_{ij} \le
\mu \sum_i w_{ij}$ for each good j. A price vector $\pi$ is a weak $\mu$-approximate equilibrium
($\mu \ge 1$) if there are bundles $x_i$ such that (1) for each trader i, $x_i$ is a $\mu$-approximate
demand of trader i at prices $\pi$, and (2) $\sum_i x_{ij} \le \mu \sum_i w_{ij}$ for each good j.


Definition 6.2 An algorithm that computes an approximate equilibrium, for any
$\epsilon > 0$, in time that is polynomial in the input size and $1/\epsilon$ (resp., $\log 1/\epsilon$) is called a
polynomial time approximation scheme (resp., polynomial time algorithm).


6.1.4 Gross Substitutability 


In general, not only are equilibria not unique, but the set of equilibrium points may be
disconnected. Yet many real markets do work, and economists have struggled to capture 
realistic restrictions on markets, where the equilibrium problem exhibits some structure, 
like uniqueness or convexity. The general approach has been to impose restrictions 
either at the level of individuals (by restricting the utility functions considered and/or 
by making assumptions on the initial endowments) or at the level of the aggregate 
market (by assuming that the composition of the individual actions is particularly well 
behaved). 

The property of gross substitutability (GS) plays a significant role in the theory of 
equilibrium and in related computational results based on convex programming. 

The market excess demand is said to satisfy gross substitutability (resp., weak
gross substitutability [WGS]) if for any two sets of prices $\pi$ and $\pi'$ such
that $0 < \pi_j \le \pi'_j$ for each j, and $\pi_j < \pi'_j$ for some j, we have that $\pi_k = \pi'_k$
for any good k implies $Z_k(\pi) < Z_k(\pi')$ (resp., $Z_k(\pi) \le Z_k(\pi')$). In words, GS
means that increasing the price of some of the goods while keeping some others
fixed can only cause an increase in the demand for the goods whose price is
fixed.

It is easy to see that GS implies that the equilibrium prices are unique up to scaling
(Varian, 1992, p. 395) and that the market excess demand satisfies GS when each
individual excess demand does.


6.1.5 Special Forms of the Utility Functions


A utility function $u(\cdot)$ is homogeneous (of degree 1) if it satisfies $u(\alpha x) = \alpha u(x)$, for
all $\alpha > 0$.

A utility function $u(\cdot)$ is log-homogeneous if it satisfies $u(\alpha x) = \log \alpha + u(x)$, for
all $\alpha > 0$.

Three popular examples of homogeneous utility functions are as follows. 


• The linear utility function, which has the form $u_i(x) = \sum_j a_{ij} x_{ij}$.

• The Cobb-Douglas function, which has the form $u_i(x) = \prod_j (x_{ij})^{a_{ij}}$, where $\sum_j a_{ij} = 1$.

• The Leontief (or fixed-proportions) utility function, which has the form $u_i(x) =
\min_j a_{ij} x_{ij}$.


We now define the constant elasticity of substitution functional form (CES, for
short), which is a family of homogeneous utility functions of particular importance in
applications. A CES function is a concave function defined as

\[ u(x_1, \ldots, x_n) = \left( \sum_{i=1}^{n} \alpha_i x_i^{\rho} \right)^{1/\rho}, \]

where the $\alpha_i$'s are the utility parameters, and $-\infty < \rho < 1$, $\rho \ne 0$, is a parameter
representing the elasticity of substitution $1/(1 - \rho)$ (see Varian, 1992, p. 13).

CES functions have been thoroughly analyzed in Arrow et al. (1961), where it has
also been shown how to derive, in the limit, their special cases, i.e., linear,
Cobb-Douglas, and Leontief functions (see Arrow et al., 1961, p. 231). For $\rho \to 1$, CES
take the linear form, and the goods are perfect substitutes, so that there is no
preference for variety. For $\rho > 0$, the goods are partial substitutes, and different values
of $\rho$ in this range allow us to express different levels of preference for variety. For
$\rho \to 0$, CES become Cobb-Douglas functions, and express a perfect balance
between substitution and complementarity effects. Indeed it is not difficult to show that
a trader with a Cobb-Douglas utility spends a fixed fraction of her income on each
good.

For $\rho < 0$, CES functions model markets with significant complementarity effects
between goods. This feature reaches its extreme (perfect complementarity) as $\rho \to
-\infty$, i.e., when CES take the form of Leontief functions.
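
The three limiting cases can be observed numerically. The sketch below evaluates the CES form at values of $\rho$ close to 1, close to 0, and very negative; the bundle, the weights (chosen to sum to 1 so that the Cobb-Douglas limit applies), and the specific values of $\rho$ are assumptions made for illustration.

```python
import math


def ces(x, alpha, rho):
    """CES utility (sum_i alpha_i * x_i**rho) ** (1/rho), for rho < 1 and rho != 0."""
    return sum(a * xi ** rho for a, xi in zip(alpha, x)) ** (1.0 / rho)


x = [2.0, 3.0, 5.0]
alpha = [0.2, 0.3, 0.5]  # hypothetical weights summing to 1

linear = sum(a * xi for a, xi in zip(alpha, x))
cobb_douglas = math.prod(xi ** a for a, xi in zip(alpha, x))
# With this parametrization the weights wash out as rho -> -infinity,
# so the limit is the unweighted minimum of the bundle.
leontief = min(x)

near_linear = ces(x, alpha, 0.999999)
near_cobb_douglas = ces(x, alpha, 1e-6)
near_leontief = ces(x, alpha, -100.0)
```

Convergence to the Leontief limit is slow in $\rho$, which is why `near_leontief` matches `leontief` only to about two digits at $\rho = -100$.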


6.1.6 Equilibrium vs Optimization 


In 1960, Negishi showed that equilibrium allocations of goods for an exchange economy 
can be determined by solving a convex program where the weights of the function to 
be maximized are unknown (Negishi, 1960). 

Negishi proved the following theorem. 


Theorem 6.3 Suppose that the initial endowment of each trader includes a
positive amount of each good. Given positive welfare weights $\alpha_i$, $i = 1, \ldots, m$,
consider the convex program

\[
\begin{aligned}
\text{Maximize} \quad & \sum_i \alpha_i u_i(x_i) \\
\text{Subject to} \quad & \sum_i x_{ij} \le \sum_i w_{ij}, \quad \text{for } 1 \le j \le n.
\end{aligned}
\]

There exist $\alpha_i > 0$, $i = 1, \ldots, m$, such that the optimal solutions $\bar{x}_i$ to the
program above with these $\alpha_i$ are equilibrium allocations. That is, for some price
vector $\pi$, $\bar{x}_i = x_i(\pi)$ for each i.


In the proof of Negishi's theorem, the price vector $\pi$ for a given set of welfare weights
$\alpha_i$ is obtained from the dual variables in the Karush-Kuhn-Tucker characterization of
the optimal solution to the convex program. Whenever the utility functions are
log-homogeneous, the Karush-Kuhn-Tucker characterization implies that $\alpha_i$ is always
equal to $\pi \cdot x_i$. For the welfare weights that correspond to equilibrium, we must then
have $\alpha_i = \pi \cdot w_i$.

Negishi’s characterization of the equilibrium has inspired certain algorithmic ap- 
proaches to compute it (Rutherford, 1999). It is also connected to some recent theoret- 
ical computer science work (Jain et al., 2003; Ye, in press). 


6.1.7 The Fisher Model 


A special case of the exchange model occurs when the initial endowments are
proportional; i.e., when $w_i = \delta_i w$, $\delta_i > 0$, so that the relative incomes of the traders
are independent of the prices. This special case is equivalent to the Fisher model, which
is a market of n goods desired by m utility maximizing buyers with fixed incomes.
In the standard account of the Fisher model, each buyer has a concave utility function
$u_i : \mathbb{R}^n_+ \to \mathbb{R}_+$ and an endowment $e_i > 0$ of money. There is a seller with an amount
$q_j > 0$ of good j. An equilibrium in this setting is a nonnegative vector of prices
$\pi = (\pi_1, \ldots, \pi_n) \in \mathbb{R}^n_+$ at which there is a bundle $\bar{x}_i = (\bar{x}_{i1}, \ldots, \bar{x}_{in}) \in \mathbb{R}^n_+$ of goods
for each trader i such that the following two conditions hold:


(i) The vector $\bar{x}_i$ maximizes $u_i(x)$ subject to the constraints $\pi \cdot x \le e_i$ and $x \in \mathbb{R}^n_+$.
(ii) For each good j, $\sum_i \bar{x}_{ij} = q_j$.


6.1.8 Overview 


The rest of this chapter is organized as follows. 

In Section 6.2, we analyze the Fisher model under the assumption that the traders are 
endowed with homogeneous utility functions, and present Eisenberg’s convex program 
for computing an equilibrium in such models. 

In Section 6.3, we consider exchange economies that satisfy weak gross
substitutability, and show that, under such conditions, an important inequality holds, which
implicitly gives a convex feasibility formulation for the equilibrium. We discuss
algorithmic work that exploits this formulation.

In Section 6.4, we discuss convex feasibility formulations for exchange economies
with some special and widely used utility functions, more precisely, linear and CES
functions.

In Section 6.5, we expose the limitations of convex programming techniques, by 
presenting examples where convexity is violated (the equilibria are multiple and dis- 
connected), and relating some of these examples to other equilibrium problems and to 
recently proven hardness results. 

In Section 6.6, we discuss convex feasibility formulations for economies that gen- 
eralize the exchange model by including production technologies. 

Finally, in Section 6.7, we guide the reader through the bibliography. 


6.2 Fisher Model with Homogeneous Consumers 


Whenever the traders have homogeneous utility functions, the equilibrium conditions
for the Fisher model can be rewritten as the solution to the following convex program
(Eisenberg's program), on nonnegative variables $x_{ij}$:

\[
\begin{aligned}
\text{Maximize} \quad & \sum_i e_i \log u_i(x_i) \\
\text{Subject to} \quad & \sum_i x_{ij} \le q_j \quad \text{for each } j.
\end{aligned}
\]

Recall that $u_i$ is the i-th trader's utility function, $e_i$ is the i-th trader's endowment of
money, and $q_j$ is the amount of the j-th good.

Notice that the program does not have variables corresponding to prices. The optimal 
solution to this program yields allocations for each trader that, at prices given by 
the Lagrangian dual variables corresponding to the optimal solution, are exactly the 
individual demands of the traders. We present a proof of this result for the case where 
the utility functions are differentiable. 

Let $\bar{x}$ be an optimal solution to Eisenberg's program. Observe that $u_i(\bar{x}_i) > 0$ for
each i. The Karush-Kuhn-Tucker necessary optimality theorem (Mangasarian, 1969,
Chapter 7.7) says that there exist $\pi_j \ge 0$, for each good j, and $\lambda_{ij} \ge 0$, for each trader
i and good j, such that

\[ \pi_j \Bigl( \sum_i \bar{x}_{ij} - q_j \Bigr) = 0 \quad \text{for each good } j, \tag{6.2} \]

\[ \lambda_{ij} \bar{x}_{ij} = 0 \quad \text{for each } i, j, \tag{6.3} \]

and

\[ \frac{e_i}{u_i(\bar{x}_i)} \, \frac{\partial u_i(\bar{x}_i)}{\partial x_{ij}} = \pi_j - \lambda_{ij} \quad \text{for each } i, j. \tag{6.4} \]

For trader i, let us multiply the j-th equality in (6.4) by $\bar{x}_{ij}$, and add the resulting
equalities. We obtain

\[ \frac{e_i}{u_i(\bar{x}_i)} \sum_j \bar{x}_{ij} \frac{\partial u_i(\bar{x}_i)}{\partial x_{ij}} = \sum_j (\pi_j - \lambda_{ij}) \bar{x}_{ij}. \]

Using (6.3) and Euler's identity $u_i(x_i) = \sum_j x_{ij} \frac{\partial u_i}{\partial x_{ij}}$ for the homogeneous $u_i$, this equality
becomes

\[ e_i = \sum_j \pi_j \bar{x}_{ij}. \]


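At the price vector $\pi$, the bundle $\bar{x}_i$ thus exhausts the budget of trader i. Let $y_i \in \mathbb{R}^n_+$
be any bundle such that $\pi \cdot y_i \le e_i$. We proceed along the lines of the
Karush-Kuhn-Tucker sufficient optimality theorem (Mangasarian, 1969, Chapter 7.2) to show that
$u_i(\bar{x}_i) \ge u_i(y_i)$. Using the concavity of $u_i$,

\[
\begin{aligned}
u_i(y_i) - u_i(\bar{x}_i) &\le \nabla u_i(\bar{x}_i) \cdot (y_i - \bar{x}_i) \\
&= \frac{u_i(\bar{x}_i)}{e_i} \sum_j (\pi_j - \lambda_{ij})(y_{ij} - \bar{x}_{ij}) \\
&= \frac{u_i(\bar{x}_i)}{e_i} \Bigl( \sum_j (\pi_j y_{ij} - \lambda_{ij} y_{ij}) - e_i \Bigr) \\
&\le \frac{u_i(\bar{x}_i)}{e_i} \Bigl( \sum_j \pi_j y_{ij} - e_i \Bigr) \\
&\le 0.
\end{aligned}
\]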

We have shown that $\bar{x}_i$ is a demand of trader i at price $\pi$. Turning now to market
clearance, observe that (6.2) implies that $\sum_i \bar{x}_{ij} = q_j$ for any good j such that $\pi_j > 0$.
For each good j such that $\pi_j = 0$, feasibility tells us that $\sum_i \bar{x}_{ij} \le q_j$; let us allocate
the excess of any such good to trader 1. Slightly abusing notation, let $\bar{x}_1$ still denote
the first trader's allocation. The bundle $\bar{x}_1$ continues to be a demand of trader 1 at price
$\pi$, since the newly allocated goods have price zero and adding positive quantities of
a certain good cannot decrease $u_1$. We have now satisfied all the requirements of an
equilibrium.
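
For Cobb-Douglas buyers the optimum of Eisenberg's program is available in closed form, which gives a quick sanity check of the argument above. The sketch below assumes strictly positive exponents $a_{ij}$ summing to 1 for each buyer; the function names and numbers are illustrative assumptions. It verifies (6.4) with $\lambda_{ij} = 0$, budget exhaustion, and market clearing, and compares the objective against one feasible perturbation.

```python
import math


def fisher_cobb_douglas(budgets, shares, supplies):
    """Closed-form Fisher equilibrium for Cobb-Douglas buyers (assumed setting).

    Buyer i spends shares[i][j] * budgets[i] on good j, so the price of good j
    is the total money spent on it divided by its supply.
    """
    m, n = len(budgets), len(supplies)
    prices = [sum(budgets[i] * shares[i][j] for i in range(m)) / supplies[j]
              for j in range(n)]
    alloc = [[shares[i][j] * budgets[i] / prices[j] for j in range(n)]
             for i in range(m)]
    return prices, alloc


def eisenberg_objective(budgets, shares, alloc):
    """sum_i e_i * log u_i(x_i) with u_i(x_i) = prod_j x_ij ** a_ij."""
    return sum(e * sum(a * math.log(x) for a, x in zip(a_i, x_i))
               for e, a_i, x_i in zip(budgets, shares, alloc))


# Hypothetical instance: two buyers, two goods.
budgets = [10.0, 20.0]
shares = [[0.5, 0.5], [0.25, 0.75]]
supplies = [1.0, 2.0]
prices, alloc = fisher_cobb_douglas(budgets, shares, supplies)

# (6.4) with lambda_ij = 0: for Cobb-Douglas, (e_i / u_i) * du_i/dx_ij = e_i * a_ij / x_ij.
kkt_gap = max(abs(budgets[i] * shares[i][j] / alloc[i][j] - prices[j])
              for i in range(2) for j in range(2))
spending = [sum(p * x for p, x in zip(prices, x_i)) for x_i in alloc]
sold = [sum(alloc[i][j] for i in range(2)) for j in range(2)]

# Moving 0.1 units of good 0 from buyer 0 to buyer 1 keeps the allocation
# feasible and should not improve the objective.
perturbed = [[alloc[0][0] - 0.1, alloc[0][1]], [alloc[1][0] + 0.1, alloc[1][1]]]
best = eisenberg_objective(budgets, shares, alloc)
worse = eisenberg_objective(budgets, shares, perturbed)
```

The prices here play the role of the Lagrangian dual variables $\pi_j$; nothing in the program itself mentions them.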


6.3 Exchange Economies Satisfying WGS 


We now consider exchange economies that satisfy WGS. In this scenario the following
important lemma holds.


Lemma 6.4 Let $\hat{\pi}$ be an equilibrium price vector for an exchange economy
that satisfies gross substitutability, and $\pi$ be any nonequilibrium price vector. We
then have $\hat{\pi} \cdot Z(\pi) > 0$.


This lemma implies that the set of equilibrium prices forms a convex set by providing,
for any positive price vector $\pi$ that is not an equilibrium price vector, a separating
hyperplane, i.e., a hyperplane that separates $\pi$ from the set of equilibrium prices. This
is the hyperplane $\{x \in \mathbb{R}^n \mid x \cdot Z(\pi) = 0\}$: indeed we have $\hat{\pi} \cdot Z(\pi) > 0$, whereas
$\pi \cdot Z(\pi) = 0$, by Walras' law. To compute this separating hyperplane, we need to
compute the demands $Z_j(\pi)$ at the prices $\pi$.
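
The separation property can be observed on a small example. The sketch below assumes a two-trader Cobb-Douglas exchange economy (which satisfies gross substitutability for these numbers) whose equilibrium, up to scaling, is $\hat{\pi} = (1, 2.5)$; the economy and the test prices are illustrative assumptions.

```python
def excess_demand(prices, shares, endowments):
    """Market excess demand for Cobb-Douglas traders (illustrative assumption)."""
    n = len(prices)
    z = [0.0] * n
    for a_i, w_i in zip(shares, endowments):
        income = sum(p * w for p, w in zip(prices, w_i))
        for j in range(n):
            z[j] += a_i[j] * income / prices[j] - w_i[j]
    return z


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


shares = [[0.5, 0.5], [0.2, 0.8]]
endowments = [[1.0, 0.0], [0.0, 1.0]]
pi_hat = [1.0, 2.5]  # equilibrium of this hypothetical economy: Z(pi_hat) = 0

test_prices = [[1.0, 0.5], [1.0, 1.0], [1.0, 2.0], [1.0, 4.0], [3.0, 1.0]]
separations = []
for pi in test_prices:
    z = excess_demand(pi, shares, endowments)
    # First entry: pi_hat . Z(pi), expected positive by Lemma 6.4.
    # Second entry: pi . Z(pi), expected zero by Walras' law.
    separations.append((dot(pi_hat, z), dot(pi, z)))

equilibrium_residual = max(abs(v) for v in excess_demand(pi_hat, shares, endowments))
```

Each nonequilibrium price in `test_prices` lies on the hyperplane through the origin with normal $Z(\pi)$, while $\hat{\pi}$ lies strictly on its positive side.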


6.3.1 Computational Results 


Lemma 6.4 tells us that if we start at price $\pi$, and move in the direction $Z(\pi)$, the
Euclidean distance to the equilibrium $\hat{\pi}$ decreases. This observation is in fact the crux
of the proof that a certain tatonnement process converges to the equilibrium.

We now present a simple algorithm, which is a discrete version of the tatonnement 
process, and prove that it converges to an approximate equilibrium in polynomial time 
for exchange markets satisfying WGS. For this, however, we will need to work with a 
transformed market. 


Two Useful Transformations 


We now describe a transformation that, given the exchange market M, produces a new
market M' in which the total amount of each good is 1. The new utility function of
the i-th trader is given by $u'_i(x_1, \ldots, x_n) = u_i(W_1 x_1, \ldots, W_n x_n)$, where $W_j$ denotes
$\sum_i w_{ij}$. It can be verified that, if $u_i(\cdot)$ is concave, then $u'_i(\cdot)$ is concave. The new initial
endowment of the j-th good held by the i-th trader is $w'_{ij} = w_{ij}/W_j$. Let $w'_i$ denote
$(w'_{i1}, \ldots, w'_{in}) \in \mathbb{R}^n_+$. Clearly, $W'_j = \sum_i w'_{ij} = 1$.

The following lemma summarizes some key properties of the transformation. 


Lemma 6.5

(i) For any $\mu \ge 1$, $(x_{i1}, \ldots, x_{in})$ is a $\mu$-approximate demand at prices $(\pi_1, \ldots, \pi_n)$
for trader i in M' if and only if the vector $(W_1 x_{i1}, \ldots, W_n x_{in})$ is a $\mu$-approximate
demand at prices $(\pi_1/W_1, \ldots, \pi_n/W_n)$ for trader i in M.

(ii) For any $\mu \ge 1$, $(\pi_1, \ldots, \pi_n)$ is a weak $\mu$-approximate equilibrium for M' if and
only if $(\pi_1/W_1, \ldots, \pi_n/W_n)$ is a weak $\mu$-approximate equilibrium for M.

(iii) The excess demand of M' satisfies WGS if the excess demand of M does.


We transform M' into another market $\hat{M}$ as follows. Let $0 < \eta \le 1$ be a parameter.
For each trader i, the new utility function and initial endowments are the same, i.e.,
$\hat{u}_i(\cdot) = u'_i(\cdot)$, and $\hat{w}_i = w'_i$. The new market $\hat{M}$ has one extra trader, whose initial
endowment is given by $\hat{w}_{m+1} = (\eta, \ldots, \eta)$, and whose utility function is the
Cobb-Douglas function $u_{m+1}(x_{m+1}) = \prod_j x_{m+1,j}^{1/n}$. A trader with this Cobb-Douglas utility
function spends 1/n-th of her budget on each good. Stated precisely, $\pi_j x_{m+1,j}(\pi) =
\pi \cdot \hat{w}_{m+1}/n$.

Note that the total amount of good j in the market $\hat{M}$ is $\hat{W}_j = \sum_{i=1}^{m+1} \hat{w}_{ij} = 1 + \eta$.


Lemma 6.6 (1) The market $\hat{M}$ has an equilibrium. (2) Every equilibrium $\pi$ of
$\hat{M}$ satisfies the condition $\frac{\max_j \pi_j}{\min_j \pi_j} \le 2n/\eta$. (3) For any $\mu \ge 1$, a weak $\mu$-approx
equilibrium for $\hat{M}$ is a weak $\mu(1 + \eta)$-approx equilibrium for M'. (4) $\hat{M}$ satisfies
WGS if M' does.


PROOF Statement (1) follows from arguments that are standard in microeconomic
theory. Briefly, a quasi-equilibrium $\pi \in \mathbb{R}^n_+$ with $\sum_j \pi_j = 1$ always exists
(Mas-Colell et al., 1995, Chapter 17, Proposition 17.BB.2). At price $\pi$ the income
$\pi \cdot \hat{w}_{m+1}$ of the (m + 1)-th trader is strictly positive. This ensures that $\pi_j > 0$
for each good j. But this implies (Mas-Colell et al., 1995, Chapter 17, Proposition
17.BB.1) that $\pi$ is an equilibrium.

The proofs of the remaining statements are left as Exercise 6.4. The proof of 
(2) illustrates one crucial role that the extra trader plays. 


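We define $\Delta = \{\pi \in \mathbb{R}^n_+ \mid \eta/2n \le \pi_j \le 1 \text{ for each } j\}$. Note that Lemma 6.6 implies
that $\hat{M}$ has an equilibrium price in $\Delta$. We define $\Delta^+ = \{\pi \in \mathbb{R}^n_+ \mid \eta/4n \le \pi_j \le 1 +
\eta/4n \text{ for each } j\}$. For any $\pi \in \Delta^+$, we have $\frac{\max_j \pi_j}{\min_j \pi_j} \le \frac{1 + \eta/4n}{\eta/4n} \le \frac{5n}{\eta}$.

Abusing notation slightly, we henceforth let $Z(\pi)$ and $X(\pi)$ denote, respectively,
the excess demand vector and the aggregate demand vector in the market $\hat{M}$.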


The Discrete Tatonnement Process 


We now state an algorithm for computing a weak $(1 + \epsilon)$-approximate equilibrium for
$M$. From Lemma 6.5 and Lemma 6.6 (applied with $\eta = \epsilon$), this $(1 + \epsilon)$-approximate
equilibrium for $M$ will then be a $(1 + O(\epsilon))$-approximate equilibrium for $M'$. The
algorithm assumes access to an oracle that can compute the excess demand vector of
$M$ at any given price vector in $\Delta^+$. Such an oracle is readily constructed from an oracle
for computing the excess demand for $M'$.
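
The oracle construction mentioned above can be sketched in a few lines. This is only an
illustration under stated assumptions: the function names and the two-trader Cobb-Douglas
example economy are hypothetical and do not come from the chapter, and price vectors are
plain Python lists. The wrapper adds the demand of the extra trader (who owns $\eta$ units of
every good and spends a $1/n$ fraction of her income on each good) to a given excess demand.

```python
def augment_with_cobb_douglas_trader(excess_demand, eta):
    """Wrap an excess-demand oracle so that it also accounts for one extra
    trader with endowment (eta, ..., eta) and equal budget shares 1/n."""
    def augmented(pi):
        n = len(pi)
        income = eta * sum(pi)          # value of the endowment (eta, ..., eta)
        z = excess_demand(pi)
        # The extra trader demands income / (n * pi_j) of good j and supplies eta.
        return [z[j] + income / (n * pi[j]) - eta for j in range(n)]
    return augmented


def example_excess_demand(pi):
    """Hypothetical 2-trader, 2-good Cobb-Douglas exchange economy."""
    shares = [[0.3, 0.7], [0.6, 0.4]]   # budget shares of each trader
    endow = [[1.0, 0.0], [0.0, 1.0]]    # initial endowments
    incomes = [sum(p * w for p, w in zip(pi, wi)) for wi in endow]
    return [
        sum(shares[i][j] * incomes[i] for i in range(2)) / pi[j]
        - sum(endow[i][j] for i in range(2))
        for j in range(2)
    ]


z_hat = augment_with_cobb_douglas_trader(example_excess_demand, eta=0.1)
pi = [0.4, 0.9]
# Walras' law: the value of the augmented excess demand should vanish.
walras = sum(p * z for p, z in zip(pi, z_hat(pi)))
```

Because the extra trader also obeys her budget constraint, the augmented excess demand
still satisfies Walras' law, which is what `walras` checks numerically.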

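Let $\pi^0$, the initial price, be any point in $\Delta$. Suppose that we have computed a
sequence of prices $\pi^0, \ldots, \pi^{i-1}$. We compute $\pi^i$ as follows. If $\pi^{i-1} \notin \Delta^+$, we let
$\pi^i$ be the point in $\Delta$ closest to $\pi^{i-1}$. In other words, $\pi^i_j = \pi^{i-1}_j$ if
$\eta/2n \le \pi^{i-1}_j \le 1$; $\pi^i_j = 1$ if $\pi^{i-1}_j > 1$; and $\pi^i_j = \eta/2n$ if $\pi^{i-1}_j < \eta/2n$.
If $\pi^{i-1} \in \Delta^+$, we let

\[ \pi^i = \pi^{i-1} + \frac{\delta}{(12n^2/\eta)^2}\, Z(\pi^{i-1}). \]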
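
The update rule just described can be sketched as follows. This is a simplified illustration,
not the chapter's exact procedure: the parameter `step` stands in for the coefficient of
$Z(\pi^{i-1})$ in the update, the stopping rule (largest absolute excess demand below `tol`) is
an assumed proxy for the weak approximate-equilibrium test, and the example economy is a
hypothetical two-trader Cobb-Douglas market without the extra trader.

```python
def project_to_box(pi, lo, hi):
    """Closest point to pi in the box [lo, hi]^n (coordinate-wise clamping)."""
    return [min(max(p, lo), hi) for p in pi]


def discrete_tatonnement(excess_demand, pi0, eta, step, tol=1e-8, max_iters=10_000):
    """Iterate the projected price update until the excess demand is small."""
    n = len(pi0)
    lo, hi = eta / (2 * n), 1.0                        # the box Delta
    lo_p, hi_p = eta / (4 * n), 1.0 + eta / (4 * n)    # the larger box Delta^+
    pi = list(pi0)
    for iteration in range(max_iters):
        if all(lo_p <= p <= hi_p for p in pi):
            z = excess_demand(pi)
            if max(abs(v) for v in z) <= tol:          # assumed stopping rule
                return pi, iteration
            pi = [p + step * v for p, v in zip(pi, z)]  # move along Z(pi)
        else:
            pi = project_to_box(pi, lo, hi)             # snap back into Delta
    return pi, max_iters


def example_excess_demand(pi):
    """Hypothetical 2-trader, 2-good Cobb-Douglas exchange economy."""
    shares = [[0.3, 0.7], [0.6, 0.4]]
    endow = [[1.0, 0.0], [0.0, 1.0]]
    incomes = [sum(p * w for p, w in zip(pi, wi)) for wi in endow]
    return [
        sum(shares[i][j] * incomes[i] for i in range(2)) / pi[j]
        - sum(endow[i][j] for i in range(2))
        for j in range(2)
    ]


prices, iterations = discrete_tatonnement(
    example_excess_demand, [0.5, 0.5], eta=0.4, step=0.1
)
```

In this example the equilibrium prices form the ray with $\pi_2/\pi_1 = 7/6$, so the returned
`prices` should have that ratio up to the tolerance.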


Analysis of Convergence 


Lemma 6.4 is the building block upon which the proof of convergence of the (continuous)
tatonnement process is based. To prove the (fast) convergence of the discrete
process just described, we need a more general result (Lemma 6.7 below). Together
with Lemma 6.8, it says that if a vector $\pi \in \Delta^+$ is not a weak $(1 + \epsilon)$-approx equilibrium
for $M$, then the hyperplane normal to $Z(\pi)$ and passing through $\pi$ separates $\pi$
from all points within a certain distance of any equilibrium of $M$ in $\Delta$.




Lemma 6.7 Let $\pi \in \Delta^+$ be a price vector that is not a weak $(1 + \epsilon)$-approximate
equilibrium for $M$, for some $\epsilon > 0$. Then for any equilibrium $\hat{\pi} \in \Delta$,
we have $\hat{\pi} \cdot Z(\pi) \ge \delta > 0$, where $1/\delta$ is bounded by a polynomial in $n$,
$1/\epsilon$, and $1/\eta$.

PROOF We can assume that the goods are ordered so that
$\pi_1/\hat{\pi}_1 \le \pi_2/\hat{\pi}_2 \le \cdots \le \pi_n/\hat{\pi}_n$.
Let $\alpha_s$ denote the quantity $\pi_s/\hat{\pi}_s$. For $1 \le s \le n$, let $q^s$ denote the price vector
$\min\{\alpha_s \hat{\pi}, \pi\}$, i.e., the componentwise minimum of $\alpha_s \hat{\pi}$ and $\pi$. Note that

\[ q^s = (\pi_1, \ldots, \pi_{s-1},\ \pi_s = \alpha_s \hat{\pi}_s,\ \alpha_s \hat{\pi}_{s+1}, \ldots, \alpha_s \hat{\pi}_n). \]

The first price $q^1$ in the sequence is an equilibrium price vector, being a scaling
of $\hat{\pi}$ by $\alpha_1$, and the last price vector $q^n$ is $\pi$. For $1 \le s \le n-1$, let $G^s_h$ denote
the set of goods $\{1, \ldots, s\}$ and $G^s_t$ denote the set of goods $\{s+1, \ldots, n\}$. If
$\alpha_s < \alpha_{s+1}$, $G^s_h$ is the subset of goods whose prices remain fixed during the $s$-th
step, where we move from $q^s$ to $q^{s+1}$, and $G^s_t$ is the complement set.

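Focusing on the $s$-th step, we have

\[
\begin{aligned}
0 &= q^{s+1} \cdot Z(q^{s+1}) - q^s \cdot Z(q^s) \\
  &= \sum_{j \in G^s_h} \pi_j \big(Z_j(q^{s+1}) - Z_j(q^s)\big)
     + \sum_{j \in G^s_t} \big(\alpha_{s+1}\hat{\pi}_j Z_j(q^{s+1}) - \alpha_s \hat{\pi}_j Z_j(q^s)\big) \\
  &= \alpha_{s+1} \sum_{j \in G^s_t} \hat{\pi}_j \big(Z_j(q^{s+1}) - Z_j(q^s)\big)
     + \sum_{j \in G^s_t} (\alpha_{s+1} - \alpha_s)\hat{\pi}_j Z_j(q^s)
     + \sum_{j \in G^s_h} \pi_j \big(Z_j(q^{s+1}) - Z_j(q^s)\big).
\end{aligned}
\]

Applying weak GS to the price vectors $q^s$ and $\alpha_s \hat{\pi}$, we see that $Z_j(q^s) \le 0$
for $j \in G^s_t$. Applying weak GS to the price vectors $q^s$ and $q^{s+1}$, we see that
$Z_j(q^{s+1}) \ge Z_j(q^s)$ for $j \in G^s_h$. Noting that $\pi_j \le \alpha_s \hat{\pi}_j \le \alpha_{s+1}\hat{\pi}_j$
for $j \in G^s_h$, we have

\[
\begin{aligned}
\alpha_{s+1} \sum_{j \in G^s_t} \hat{\pi}_j \big(Z_j(q^{s+1}) - Z_j(q^s)\big)
  &= -\sum_{j \in G^s_h} \pi_j \big(Z_j(q^{s+1}) - Z_j(q^s)\big)
     - \sum_{j \in G^s_t} (\alpha_{s+1} - \alpha_s)\hat{\pi}_j Z_j(q^s) \\
  &\ge -\sum_{j \in G^s_h} \pi_j \big(Z_j(q^{s+1}) - Z_j(q^s)\big) \\
  &\ge -\alpha_s \sum_{j \in G^s_h} \hat{\pi}_j \big(Z_j(q^{s+1}) - Z_j(q^s)\big).
\end{aligned}
\]

That is,

\[
\hat{\pi} \cdot \big(Z(q^{s+1}) - Z(q^s)\big) \ge
\Big(1 - \frac{\alpha_s}{\alpha_{s+1}}\Big) \sum_{j \in G^s_h} \hat{\pi}_j \big(Z_j(q^{s+1}) - Z_j(q^s)\big).
\tag{6.5}
\]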
Since the right-hand side is nonnegative, we have, for each $1 \le s \le n-1$,

\[ \hat{\pi} \cdot \big(Z(q^{s+1}) - Z(q^s)\big) \ge 0. \tag{6.6} \]

Because $\pi = q^n$ is not a weak $(1 + \epsilon)$-approximate equilibrium for $M$, we must have
$\alpha_n/\alpha_1 > 1 + \epsilon/3$. (See Exercise 6.5.) So there is some value $1 \le k \le n-1$ so that
$\alpha_{k+1}/\alpha_k > 1 + \epsilon/6n$. We will show that the right-hand side of equation (6.5) is large
for $k$.

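We have $1 - \frac{\alpha_k}{\alpha_{k+1}} \ge 1 - \frac{1}{1 + \epsilon/6n} \ge \frac{\epsilon}{12n}$.

We can lower bound the increase in income of the $(m+1)$-th trader when
we move from $q^k$ to $q^{k+1}$:

\[
\begin{aligned}
q^{k+1} \cdot \hat{w}_{m+1} - q^k \cdot \hat{w}_{m+1}
  &\ge (q^{k+1}_n - q^k_n)\, \hat{w}_{m+1,n} \\
  &= (\alpha_{k+1} - \alpha_k)\, \hat{\pi}_n \hat{w}_{m+1,n} \\
  &\ge \frac{\epsilon \alpha_k}{6n}\, \hat{\pi}_n \hat{w}_{m+1,n}.
\end{aligned}
\]

Recall that the $(m+1)$-th trader is a Cobb-Douglas trader with a utility function
that ensures that she spends $1/n$-th of her income on each good. As a result, we
have

\[
\begin{aligned}
x_{m+1,1}(q^{k+1}) - x_{m+1,1}(q^k)
  &= \frac{q^{k+1} \cdot \hat{w}_{m+1}}{n q^{k+1}_1} - \frac{q^k \cdot \hat{w}_{m+1}}{n q^k_1} \\
  &= \frac{1}{n \pi_1} \big(q^{k+1} \cdot \hat{w}_{m+1} - q^k \cdot \hat{w}_{m+1}\big) \\
  &\ge \frac{\epsilon \alpha_k \hat{\pi}_n \hat{w}_{m+1,n}}{6n^2 \pi_1}.
\end{aligned}
\]

Since the market $M'$ (the one without the $(m+1)$-th trader) satisfies weak GS
and $1 \in G^k_h$, we have

\[ \sum_{i=1}^{m} x_{i1}(q^{k+1}) - \sum_{i=1}^{m} x_{i1}(q^k) \ge 0. \]

Adding the two inequalities, we get
$Z_1(q^{k+1}) - Z_1(q^k) \ge \frac{\epsilon \alpha_k \hat{\pi}_n \hat{w}_{m+1,n}}{6n^2 \pi_1}$. Plugging
this into equation (6.5), and recalling that $Z_j(q^{k+1}) - Z_j(q^k) \ge 0$ for $j \in G^k_h$, we
have

\[
\hat{\pi} \cdot \big(Z(q^{k+1}) - Z(q^k)\big)
  \ge \Big(1 - \frac{\alpha_k}{\alpha_{k+1}}\Big) \sum_{j \in G^k_h} \hat{\pi}_j \big(Z_j(q^{k+1}) - Z_j(q^k)\big)
  \ge \frac{\epsilon^2 \alpha_k \hat{\pi}_1 \hat{\pi}_n \hat{w}_{m+1,n}}{72 n^3 \pi_1}.
\]

Adding this inequality and the inequalities (6.6) for each $s \ne k$, we get

\[
\hat{\pi} \cdot Z(\pi) = \hat{\pi} \cdot \big(Z(q^n) - Z(q^1)\big)
  \ge \frac{\epsilon^2 \alpha_k \hat{\pi}_1 \hat{\pi}_n \hat{w}_{m+1,n}}{72 n^3 \pi_1} = \delta.
\]

It is easily verified that $1/\delta$ is bounded by a polynomial in $n$, $1/\epsilon$, and $1/\eta$.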


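Lemma 6.8 For any $\pi \in \Delta^+$, $\|Z(\pi)\|_2 \le 12n^2/\eta$.


PROOF

\[
\begin{aligned}
\|Z(\pi)\|_2 &\le \sum_j |Z_j(\pi)| \\
  &\le \sum_j X_j(\pi) + \sum_j \hat{W}_j \\
  &\le \frac{\max_k \pi_k}{\min_k \pi_k} \sum_j \hat{W}_j + \sum_j \hat{W}_j \\
  &\le \frac{5n}{\eta} \sum_j \hat{W}_j + \sum_j \hat{W}_j \\
  &\le \frac{10n^2}{\eta} + 2n \\
  &\le \frac{12n^2}{\eta},
\end{aligned}
\]

where the third inequality follows from a simple calculation, the fourth inequality
holds because $\pi \in \Delta^+$, and the fifth inequality holds because $\hat{W}_j \le 2$ for
each $j$.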


We are now ready for the proof of correctness of the discrete tatonnement process. 


Theorem 6.9 Let $\mu$ denote $\min\{\delta^2/(12n^2/\eta)^2,\ (\eta/4n)^2\}$. Within $n/\mu$ iterations, the
algorithm computes a price in $\Delta^+$ which is a weak $(1 + \epsilon)$-approximate equilibrium
for $M$. (Note that the bound on $1/\mu$ is polynomial in the input size of the
original market $M'$, $1/\epsilon$, and $1/\eta$.)


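PROOF Let us fix an equilibrium $\pi^*$ of $M$ in $\Delta$. We argue that in each iteration,
the distance to $\pi^*$ falls significantly so long as we do not encounter an approximate
equilibrium in $\Delta^+$. If $\pi^{i-1} \notin \Delta^+$, we have
$|\pi^*_j - \pi^{i-1}_j| - |\pi^*_j - \pi^i_j| \ge 0$ for each $j$, while
$|\pi^*_j - \pi^{i-1}_j| - |\pi^*_j - \pi^i_j| \ge \eta/4n$ for some $j$. From this it follows that

\[ \|\pi^* - \pi^{i-1}\|^2 - \|\pi^* - \pi^i\|^2 \ge (\eta/4n)^2. \]

Now suppose that $\pi^{i-1} \in \Delta^+$ and that $\pi^{i-1}$ is not a weak $(1 + \epsilon)$-approx
equilibrium for $M$. By Lemma 6.7, $\pi^* \cdot Z(\pi^{i-1}) \ge \delta$. Since $\pi^{i-1} \cdot Z(\pi^{i-1}) = 0$
by Walras' Law, we have $(\pi^* - \pi^{i-1}) \cdot Z(\pi^{i-1}) \ge \delta$.

Let $q$ denote the vector $\pi^i - \pi^{i-1} = \frac{\delta}{(12n^2/\eta)^2} Z(\pi^{i-1})$. We have

\[
\begin{aligned}
(\pi^* - \pi^{i-1} - q) \cdot q
  &= (\pi^* - \pi^{i-1}) \cdot q - q \cdot q \\
  &= \frac{\delta}{(12n^2/\eta)^2} (\pi^* - \pi^{i-1}) \cdot Z(\pi^{i-1})
     - \Big(\frac{\delta}{(12n^2/\eta)^2}\Big)^2 \|Z(\pi^{i-1})\|^2 \\
  &\ge \frac{\delta^2}{(12n^2/\eta)^2} - \frac{\delta^2}{(12n^2/\eta)^4} (12n^2/\eta)^2 = 0.
\end{aligned}
\]

Thus,

\[
\begin{aligned}
\|\pi^* - \pi^{i-1}\|^2 - \|\pi^* - \pi^i\|^2
  &= \|\pi^* - \pi^{i-1}\|^2 - \|\pi^* - \pi^{i-1} - q\|^2 \\
  &= (\pi^* - \pi^{i-1}) \cdot q + (\pi^* - \pi^{i-1} - q) \cdot q \\
  &\ge (\pi^* - \pi^{i-1}) \cdot q \\
  &= \frac{\delta}{(12n^2/\eta)^2} (\pi^* - \pi^{i-1}) \cdot Z(\pi^{i-1}) \\
  &\ge \frac{\delta^2}{(12n^2/\eta)^2}.
\end{aligned}
\]

Suppose that every vector in the sequence $\pi^0, \ldots, \pi^k$ is either not in $\Delta^+$ or
not a weak $(1 + \epsilon)$-approx equilibrium. We then have

\[
\|\pi^* - \pi^{i-1}\|^2 - \|\pi^* - \pi^i\|^2 \ge
\min\Big\{ \frac{\delta^2}{(12n^2/\eta)^2},\ (\eta/4n)^2 \Big\} = \mu,
\]

for $1 \le i \le k$. Adding these inequalities, we get

\[ k\mu \le \|\pi^* - \pi^0\|^2 - \|\pi^* - \pi^k\|^2 \le n. \]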

Putting everything together, we can state the main result of this section. 


Theorem 6.10 Let $M$ be an exchange market whose excess demand function
satisfies WGS, and suppose that $M$ is equipped with an oracle for computing the
excess demand at any given price vector. For any $\epsilon > 0$, the tatonnement-based
algorithm computes, in time polynomial in the input size of $M$ and $1/\epsilon$, a sequence
of prices one of which is a weak $(1 + \epsilon)$-approx equilibrium for $M$.


In order to actually pick the approximate equilibrium price from the sequence of
prices, we need an efficient algorithm that recognizes an approximate equilibrium of $M$.
In fact, it is sufficient for this algorithm to assert that a given price $\pi$ is a weak
$(1 + 2\epsilon)$-approximate equilibrium provided $\pi$ is a weak $(1 + \epsilon)$-approximate equilibrium.
Since the problem of recognizing an approximate equilibrium is an explicitly presented
convex programming problem, such an algorithm is generally quite easy to construct.


6.4 Specific Utility Functions 


In many economic scenarios, the market is modeled by consumers having some specific 
utility functions. While in some cases this does not lead to a simplified computational 
problem, in other instances, the specific utility functions might expose a computation- 
ally useful structure. This turns out to be the case for linear utility functions, as well as 
for certain CES utility functions.


6.4.1 Convex Programs for Linear Exchange Economies


The equilibrium conditions for an exchange economy with linear utilities can be written
as a finite convex feasibility problem. Suppose that the linear utility function of the $i$-th
trader is $\sum_j a_{ij} x_{ij}$, and suppose that $w_{ij} > 0$ for each $i, j$.

Consider now the problem of finding $\psi_j$ and nonnegative $x_{ij}$ such that

\[
\sum_k a_{ik} x_{ik} \ge a_{ij} \sum_k w_{ik} e^{\psi_k - \psi_j},
\quad \text{for each } 1 \le i \le m,\ 1 \le j \le n,
\]
\[
\sum_i x_{ij} = \sum_i w_{ij}, \quad \text{for each } 1 \le j \le n.
\]

Any solution to this program corresponds to an equilibrium obtained by setting
$\pi_j = e^{\psi_j}$. The converse also holds, i.e., any equilibrium corresponds to a solution to
this program.
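
As a small sanity check of how the program is used, the following sketch tests whether a
candidate pair $(\psi, x)$ satisfies the two families of constraints. It assumes that the first
family is read as an inequality ("utility is at least $a_{ij}$ times the income measured in units
of good $j$"); the function name, the tolerance, and the example economy are hypothetical.

```python
import math


def satisfies_linear_program(a, w, psi, x, tol=1e-9):
    """Check the convex feasibility constraints for linear utilities.

    a[i][j]: utility coefficients, w[i][j]: endowments,
    psi[j]: logarithms of prices, x[i][j]: candidate allocation.
    """
    m, n = len(a), len(a[0])
    if any(x[i][j] < -tol for i in range(m) for j in range(n)):
        return False
    for i in range(m):
        utility = sum(a[i][k] * x[i][k] for k in range(n))
        for j in range(n):
            # Income of trader i expressed in units of good j.
            income_in_good_j = sum(w[i][k] * math.exp(psi[k] - psi[j]) for k in range(n))
            if utility < a[i][j] * income_in_good_j - tol:
                return False
    for j in range(n):
        demand = sum(x[i][j] for i in range(m))
        supply = sum(w[i][j] for i in range(m))
        if abs(demand - supply) > tol:
            return False
    return True


# Hypothetical symmetric economy: trader 1 prefers good 1, trader 2 prefers good 2.
a = [[2.0, 1.0], [1.0, 2.0]]
w = [[0.5, 0.5], [0.5, 0.5]]
x = [[1.0, 0.0], [0.0, 1.0]]
at_equal_prices = satisfies_linear_program(a, w, [0.0, 0.0], x)   # prices (1, 1)
at_skewed_prices = satisfies_linear_program(a, w, [0.0, 1.0], x)  # prices (1, e)
```

With equal prices each trader spends her whole income on her preferred good, so the
candidate is feasible; with the skewed prices the same allocation violates the first
constraint for trader 1.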

We will discuss the ideas behind the derivation of the convex program above in the 
context of economies with production (Section 6.6). 


6.4.2 Convex Programs for CES Exchange Economies 


Demand of CES Consumers. We start by characterizing the demand function of
traders with CES utility functions. Consider a setting where trader $i$ has an initial
endowment $w_i = (w_{i1}, \ldots, w_{in}) \in \mathbb{R}^n_+$ of goods, and the CES utility function
$u_i(x_{i1}, \ldots, x_{in}) = (\sum_{j=1}^n \alpha_{ij} x_{ij}^{\rho_i})^{1/\rho_i}$, where $\alpha_{ij} > 0$, $w_{ij} > 0$, and
$-\infty < \rho_i < 1$, but $\rho_i \ne 0$. If $\rho_i < 0$, we define $u_i(x_{i1}, \ldots, x_{in}) = 0$ if there is a $j$
such that $x_{ij} = 0$. Note that this ensures that $u_i$ is continuous over $\mathbb{R}^n_+$.

The demand vector for the $i$-th consumer is unique and is given by the expression

\[
x_{ij}(\pi) = \frac{\alpha_{ij}^{1/(1-\rho_i)}}{\pi_j^{1/(1-\rho_i)}} \times
\frac{\sum_k \pi_k w_{ik}}{\sum_k \alpha_{ik}^{1/(1-\rho_i)} \pi_k^{-\rho_i/(1-\rho_i)}}.
\tag{6.7}
\]

The formula above can be derived using the Karush-Kuhn-Tucker conditions.
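
Formula (6.7) translates directly into code. The sketch below is illustrative only (the
function name and the numbers are hypothetical); it evaluates the demand of a single CES
trader, after which one can check that the bundle exhausts the budget and equalizes the
ratio of marginal utilities with the price ratio.

```python
def ces_demand(alpha, w, pi, rho):
    """Demand of one CES trader at prices pi, following formula (6.7).

    alpha: positive utility weights, w: endowment, rho < 1 with rho != 0.
    """
    sigma = 1.0 / (1.0 - rho)
    income = sum(p * wk for p, wk in zip(pi, w))
    # Note that 1 - sigma equals -rho / (1 - rho), the exponent in (6.7).
    denom = sum(a ** sigma * p ** (1.0 - sigma) for a, p in zip(alpha, pi))
    return [a ** sigma * p ** (-sigma) * income / denom for a, p in zip(alpha, pi)]


alpha, w, pi, rho = [1.0, 2.0], [1.0, 1.0], [1.0, 2.0], -1.0
x = ces_demand(alpha, w, pi, rho)
spent = sum(p * xj for p, xj in zip(pi, x))
# Ratio of marginal utilities alpha_j * x_j^(rho - 1) for goods 1 and 2.
mu_ratio = (alpha[0] * x[0] ** (rho - 1.0)) / (alpha[1] * x[1] ** (rho - 1.0))
```

For these numbers the trader's income is 3 and the demand works out to one unit of each
good.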


Efficient Computation by Convex Programming. Consider an economy in which
each trader $i$ has a CES utility function with $-1 \le \rho_i < 0$. We show that the equilibria
of such an economy can be characterized as the solutions of a convex feasibility
problem.

Since the demand of every trader is well-defined and unique at any price, we may
write the equilibria as the set of $\pi \in \mathbb{R}^n_{++}$ such that for each good $j$, we have
$\sum_i x_{ij}(\pi) \le \sum_i w_{ij}$. Let $\rho = -1$, and note that $\rho \le \rho_i$, for each $i$. Let
$f_{ij}(\pi) = \pi_j^{1/(1-\rho)} x_{ij}(\pi)$, and $\sigma_j = \pi_j^{1/(1-\rho)}$. In terms of the $\sigma_j$'s, we obtain
the set of $\sigma = (\sigma_1, \ldots, \sigma_n) \in \mathbb{R}^n_{++}$ such that for each good $j$,

\[ \sum_i f_{ij}(\sigma) \le \sigma_j \Big( \sum_i w_{ij} \Big). \]

We now show that these inequalities give rise to a convex feasibility program. Since
the right-hand side of each inequality is a linear function, it suffices to argue that the
left-hand side is a convex function. The latter claim is established by the following
proposition.


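Proposition 6.11 The function $f_{ij}(\sigma)$ is a convex function over $\mathbb{R}^n_{++}$.


PROOF Clearly, it suffices to show that the constraint $f_{ij} \le t$ defines a convex
set for positive $t$. Using formula (6.7) for the demand, this constraint can be
written as

\[
\frac{\alpha_{ij}^{1/(1-\rho_i)}}{\sigma_j^{(1-\rho)/(1-\rho_i) - 1}} \times
\frac{\sum_k \sigma_k^{1-\rho} w_{ik}}{\sum_k \alpha_{ik}^{1/(1-\rho_i)} \sigma_k^{-\rho_i(1-\rho)/(1-\rho_i)}} \le t.
\]

Rewriting, and raising both sides to the power $1/(1-\rho)$, we obtain

\[
\alpha_{ij}^{\frac{1}{(1-\rho_i)(1-\rho)}} \Big( \sum_k \sigma_k^{1-\rho} w_{ik} \Big)^{\frac{1}{1-\rho}}
\le t^{\frac{1}{1-\rho}}\, \sigma_j^{\frac{\rho_i - \rho}{(1-\rho_i)(1-\rho)}}\, v_i^{\frac{-\rho_i}{1-\rho_i}},
\tag{6.8}
\]

where

\[
v_i = \Big( \sum_k \alpha_{ik}^{1/(1-\rho_i)} \sigma_k^{-\rho_i(1-\rho)/(1-\rho_i)} \Big)^{\frac{1-\rho_i}{-\rho_i(1-\rho)}}.
\tag{6.9}
\]

The left-hand side of inequality 6.8 is a convex function, and the right-hand
side is a concave function that is nondecreasing in each argument when viewed as
a function of $t$, $\sigma_j$, and $v_i$, since the exponents are nonnegative and add up to one.
Since $0 < \frac{-\rho_i(1-\rho)}{1-\rho_i} \le 1$, the right-hand side of equality 6.9 is a concave function,
in fact a CES function. It follows that the right-hand side of inequality 6.8 remains
a concave function when $v_i$ is replaced by the right-hand side of equality 6.9. This
completes the proof.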


It is not hard to verify that the demand generated by an economy with CES util- 
ities as above need not satisfy WGS. Indeed, the connectedness of the equilibria 
that is a corollary of the above convex feasibility formulation is an interesting new 
consequence. 


6.5 Limitations 


So far, we have presented efficient algorithms for restricted versions of the market 
equilibrium problem, which take advantage of the convexity of the set of equilibria. 
However, the set of equilibria in a general exchange economy does not even need to be 
connected. This implies that it is not possible to characterize the set of equilibria by a 
convex formulation.

In Section 6.5.1 we report an example that shows that CES exchange economies
may present multiple disconnected equilibria, whenever $\rho < -1$. This suggests that
it is unlikely that the results shown in Section 6.4.2 can be extended to encompass
markets where some traders have CES utility functions with $\rho < -1$.

In Section 6.5.2 we outline some more general obstacles to the efficient solvabil- 
ity of the market equilibrium problem. More precisely, we give a tour of a num- 
ber of recent computational complexity results which imply that Leontief exchange 
economies are hard for PPAD, a complexity class that contains a wealth of equi- 
librium problems. This shows that it is unlikely that the market equilibrium problem, 
even when restricted to exchange economies with Leontief consumers, can be solved in 
polynomial time. 


6.5.1 Multiple Disconnected Equilibria 


We describe a simple market with two traders and two goods that has multiple
disconnected equilibria. The first trader has an initial bundle $w_1 = (1, 0)$ and the CES
utility function $u_1(x, y) = ((ax)^{\rho} + y^{\rho})^{1/\rho}$, where $a > 0$. The second trader has an
initial bundle $w_2 = (0, 1)$ and the CES utility function $u_2(x, y) = ((x/a)^{\rho} + y^{\rho})^{1/\rho}$. It
is possible to show that for each $\rho < -1$ there is a sufficiently small value of $a$ for
which


(i) the vector $(1/2, 1/2)$ is an equilibrium price and
(ii) the vector $(p, 1 - p)$ is an equilibrium price for some $p < 1/2$, and the vector
$(q, 1 - q)$ is not an equilibrium price for any $p < q < 1/2$.


This economy therefore does not admit a convex programming formulation in terms of
some "relative" of the prices (such as the one given in Section 6.4.2 in terms of the $\sigma_k$)
that captures all the price equilibria. Such a formulation implies that if $(p_1, 1 - p_1)$
is a price equilibrium and $(p_2, 1 - p_2)$ is a price equilibrium for some $p_1 < p_2$, then
$(p_3, 1 - p_3)$ is also a price equilibrium for every $p_1 < p_3 < p_2$.

This example suggests that it may not be possible to extend convex programming
techniques to encompass markets where some traders have a CES utility function with
$\rho < -1$.
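
The two-trader market of this section is small enough to explore numerically. The sketch
below is an illustration under stated assumptions, not a proof: it normalizes prices to
$(p, 1 - p)$, evaluates the excess demand for the first good with the standard CES demand
formula, and locates sign changes on a grid by bisection. The parameter choices
$\rho = -3$ with $a = 0.3$ and $a = 0.5$, the grid, and all function names are hypothetical.

```python
def excess_demand_x(p, rho, a):
    """Excess demand for good x at prices (p, 1 - p) in the two-trader market."""
    sigma = 1.0 / (1.0 - rho)
    q = 1.0 - p

    def demand_x(weight_x, income):
        # CES demand for x when the utility weights are (weight_x, 1).
        num = weight_x ** sigma * p ** (-sigma) * income
        den = weight_x ** sigma * p ** (1.0 - sigma) + q ** (1.0 - sigma)
        return num / den

    x1 = demand_x(a ** rho, p)       # trader 1 owns (1, 0), so her income is p
    x2 = demand_x(a ** (-rho), q)    # trader 2 owns (0, 1), so her income is 1 - p
    return x1 + x2 - 1.0


def find_equilibria(rho, a, grid=2000, iters=80):
    """Return the prices p in (0.001, 0.999) where the excess demand changes sign."""
    ps = [0.001 + 0.998 * i / (grid - 1) for i in range(grid)]
    roots = []
    for lo, hi in zip(ps, ps[1:]):
        f_lo, f_hi = excess_demand_x(lo, rho, a), excess_demand_x(hi, rho, a)
        if f_lo * f_hi >= 0.0:
            continue                  # no sign change on this grid cell
        for _ in range(iters):        # plain bisection
            mid = 0.5 * (lo + hi)
            f_mid = excess_demand_x(mid, rho, a)
            if f_lo * f_mid <= 0.0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        roots.append(0.5 * (lo + hi))
    return roots


small_a = find_equilibria(rho=-3.0, a=0.3)   # three isolated equilibria
larger_a = find_equilibria(rho=-3.0, a=0.5)  # only the symmetric one
```

For $\rho = -3$ and $a = 0.3$ the scan finds three isolated equilibrium prices, roughly
$p \approx 0.064$, $p = 0.5$, and $p \approx 0.936$, while for $a = 0.5$ only $p = 0.5$ remains,
in line with the requirement that $a$ be sufficiently small.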


6.5.2 Hardness for the Class PPAD


The context of computation of equilibria calls for a complexity analysis conducted
within the class TFNP of total search problems, i.e., problems whose set of solutions
is guaranteed to be nonempty. Nash's theorem guarantees that the problem of finding a
Nash equilibrium in a noncooperative game in normal form is a total search problem.
The Arrow-Debreu theorem gives sufficient conditions under which an exchange economy
has an equilibrium. Therefore, under suitable sufficient conditions, the problem
of finding a market equilibrium is a total search problem.

An important subclass of TFNP is the class PPAD, which is the class of total
functions whose totality is proven by the following simple combinatorial argument: if a
directed graph whose nodes have in-degree and out-degree at most one has a source, it
must have a sink (see Chapter 2 of this book for more background, Papadimitriou, 2007).
This class captures a wealth of equilibrium problems, e.g., the market equilibrium
problem as well as Nash equilibria for n-player games. Problems complete for this
class include a (suitably defined) computational version of the Brouwer Fixed Point
Theorem.

Consider exchange economies where m, the number of traders, is equal to the 
number of goods, and the i-th trader has an initial endowment given by one unit of 
the i-th good. The traders have a Leontief (or fixed-proportion) utility function, which 
describes their goal of getting a bundle of goods in proportions determined by m given 
parameters. 

Given an arbitrary bimatrix game, specified by a pair of $n \times m$ matrices $A$ and
$B$, with positive entries, one can construct a Leontief exchange economy with $n + m$
traders and $n + m$ goods as follows.

Trader $i$ has an initial endowment consisting of one unit of good $i$, for
$i = 1, \ldots, n + m$. Traders indexed by any $j \in \{1, \ldots, n\}$ receive some utility only from
goods $j \in \{n + 1, \ldots, n + m\}$, and this utility is specified by parameters corresponding to the
entries of the matrix $B$. More precisely, the proportions in which the $j$-th trader wants
the goods are specified by the entries on the $j$-th row of $B$. Vice versa, traders indexed
by any $j \in \{n + 1, \ldots, n + m\}$ receive some utility only from goods $j \in \{1, \ldots, n\}$.
In this case, the proportions in which the $j$-th trader wants the goods are specified by
the entries on the $j$-th column of $A$.
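
The construction can be written down as a small data transformation. The sketch below is a
hypothetical illustration: it returns a matrix `F` whose row for each trader lists the
proportions in which she wants the goods, together with the identity endowment matrix. It
assumes that trader $n + j$ uses the $j$-th column of $A$ (one reading of the indexing above),
and it uses 0-based indices.

```python
def leontief_economy_from_bimatrix(A, B):
    """Build the Leontief proportions and endowments from an n x m bimatrix game."""
    n, m = len(A), len(A[0])
    size = n + m
    F = [[0.0] * size for _ in range(size)]
    for j in range(n):
        # Traders 0..n-1 want only goods n..n+m-1, in the proportions of row j of B.
        for k in range(m):
            F[j][n + k] = float(B[j][k])
    for j in range(m):
        # Traders n..n+m-1 want only goods 0..n-1, in the proportions of column j of A.
        for i in range(n):
            F[n + j][i] = float(A[i][j])
    # Trader i owns exactly one unit of good i.
    W = [[1.0 if i == k else 0.0 for k in range(size)] for i in range(size)]
    return F, W


A = [[1, 2, 3], [4, 5, 6]]       # hypothetical 2 x 3 payoff matrices
B = [[7, 8, 9], [10, 11, 12]]
F, W = leontief_economy_from_bimatrix(A, B)
```

The block structure of `F` (zeros on the two diagonal blocks) reflects the fact that each
group of traders is interested only in the goods brought by the other group.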

In the economy above, one can partition the traders in two groups, which bring to 
the market disjoint sets of goods, and are interested only in the goods brought by the 
group they do not belong to. 

It is possible to show that the Nash equilibria of any bimatrix game (A, B) are in 
one-to-one correspondence with the market equilibria of such an economy, and that 
the correspondence can be computed in polynomial time. (For the Leontief economies 
under consideration, we need to get rid of the assumption — see the Introduction — 
that we will be concerned only with positive price equilibria. It is only then that they 
capture the complexity of bimatrix games.) 

The problem of computing a Nash equilibrium for two-player nonzero sum games
has been proven PPAD-complete. Combined with the game-market correspondence
mentioned above, these hardness results imply that the problem of computing a market
equilibrium, even when confined to the restrictive scenario of a special family of
Leontief economies, is PPAD-complete.


6.6 Models with Production 


In this section, we derive convex programs for certain economies that generalize the
exchange model by including constant returns to scale technologies. The ideas for
deriving these convex programs build on the ones developed for exchange economies
with special utility functions. In a constant returns economy $M$, there are $\ell$ producers,
as well as the $m$ consumers and $n$ goods of the exchange model. The $k$-th producer is
equipped with a technology that is capable of producing some good, say $o_k$, using the $n$
goods as input. The technology is specified by a concave function
$f_k : \mathbb{R}^n_+ \to \mathbb{R}_+$ that is assumed to be homogeneous of degree 1. The interpretation is
that given quantity $z_j \ge 0$ of good $j$, for $1 \le j \le n$, the technology can produce up to
$f_k(z_1, \ldots, z_n)$ units of good $o_k$.

At a given price vector $\pi = (\pi_1, \ldots, \pi_n) \in \mathbb{R}^n_+$, the producer will choose a
technologically feasible production plan that maximizes her profit. That is, she will choose
$z_1, \ldots, z_n \ge 0$ that maximizes the profit $\pi_{o_k} f_k(z_1, \ldots, z_n) - \sum_{j=1}^n \pi_j z_j$. Now if there
is a choice of nonnegative $z_1, \ldots, z_n$ such that $\pi_{o_k} f_k(z_1, \ldots, z_n) - \sum_{j=1}^n \pi_j z_j > 0$,
then using inputs $\alpha z_1, \ldots, \alpha z_n$, for $\alpha > 1$, she can obtain a profit of

\[
\pi_{o_k} f_k(\alpha z_1, \ldots, \alpha z_n) - \sum_{j=1}^n \pi_j \alpha z_j
= \alpha \Big( \pi_{o_k} f_k(z_1, \ldots, z_n) - \sum_{j=1}^n \pi_j z_j \Big).
\]

Thus a profit-maximizing plan is not defined in this case. A profit-maximizing plan is
defined if and only if no feasible plan can make a strictly positive profit. In such a case,
a profit-maximizing plan is one that makes zero profit. In particular, the trivial choice
$z_j = 0$, for $1 \le j \le n$, for which $f_k(z_1, \ldots, z_n) = 0$, is always a profit-maximizing
plan whenever profit maximization is well defined.

It is useful to restate the above in terms of the unit cost function
$c_k : \mathbb{R}^n_+ \to \mathbb{R}_+$. This is defined, at any given price vector
$(\pi_1, \ldots, \pi_n) \in \mathbb{R}^n_+$, to be the minimum cost for producing one unit of good $o_k$. That is,

\[
c_k(\pi) = \min \Big\{ \sum_{j=1}^n \pi_j z_j \ \Big|\ z_j \ge 0,\ f_k(z_1, \ldots, z_n) \ge 1 \Big\}.
\]

If $\pi_{o_k} > c_k(\pi)$, then profit maximization is undefined. If $\pi_{o_k} < c_k(\pi)$, then the only
profit-maximizing plan is the trivial plan. If $\pi_{o_k} = c_k(\pi)$, the trivial plan, as well as any
$(z_1, \ldots, z_n)$ such that $f_k(z_1, \ldots, z_n)\, c_k(\pi) = \sum_{j=1}^n \pi_j z_j$, is a profit-maximizing plan.

Each consumer is identical to the one in the exchange model: she has an initial
endowment $w_i \in \mathbb{R}^n_+$ and a utility function $u_i$, which we now assume to be homogeneous.
An equilibrium is a price vector $\pi = (\pi_1, \ldots, \pi_n)$ at which there is a bundle
$x_i = (x_{i1}, \ldots, x_{in}) \in \mathbb{R}^n_+$ of goods for each trader $i$ and a bundle
$z_k = (z_{k1}, \ldots, z_{kn}) \in \mathbb{R}^n_+$ for each producer $k$ such that the following three conditions
hold: (i) For each firm $k$, profit maximization is well-defined at $\pi$ and the inputs
$z_k = (z_{k1}, \ldots, z_{kn})$ and output $q_{k o_k} = f_k(z_{k1}, \ldots, z_{kn})$ is a profit-maximizing plan;
(ii) for each consumer $i$, the vector $x_i$ is her demand at price $\pi$; and (iii) for each good $j$,
the total demand is no more than the total supply; i.e., the market clears:

\[
\sum_i x_{ij} + \sum_k z_{kj} \le \sum_i w_{ij} + \sum_{k : j = o_k} q_{kj}.
\]

Note that requirement (i) means that there is no feasible plan that makes positive 
profit. This rules out the trivial approach of ignoring the production units and computing 
an equilibrium for the resulting exchange model. 

We now derive a convex program for certain kinds of utility and production functions.
We first transform the economy $M$ into an economy $M'$ with $m$ consumers, $n + m$
goods, and $\ell + m$ producers. For each consumer $i$, an additional good, which will
be the $(n + i)$-th good, is added. The new utility function of the $i$-th consumer is
$u'_i(x_1, \ldots, x_{n+m}) = x_{n+i}$; that is, the $i$-th consumer wants only good $n + i$. The new
initial endowment $w'_i$ is the same as the old one; that is, $w'_{ij} = w_{ij}$ if $j \le n$, and $w'_{ij} = 0$
if $j > n$. The first $\ell$ producers stay the same. That is, for $k \le \ell$, the $k$-th producer
outputs good $o_k$ using the technology described by the function
$f'_k(z_1, \ldots, z_{n+m}) = f_k(z_1, \ldots, z_n)$. For $1 \le i \le m$, the $(\ell + i)$-th producer outputs good
$n + i$ using the technology described by the function
$f'_{\ell+i}(z_1, \ldots, z_{n+m}) = u_i(z_1, \ldots, z_n)$. It can be
shown that there is a one-to-one correspondence between the equilibria of $M$ and $M'$.
We will therefore focus on characterizing the equilibria of $M'$; the simplicity of its
consumption side will be of considerable help in this task.


6.6.1 Inequalities Characterizing Equilibrium 


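We begin by characterizing the equilibria for the market $M'$ in terms of a system
$G$ of inequalities, in the following sets of nonnegative variables: (1) $\pi_1, \ldots, \pi_{n+m}$,
for the prices; (2) $x_{i,n+i}$, for the demand of consumer $i$ for the $(n + i)$-th good; (3)
$z_k = (z_{k1}, \ldots, z_{kn}) \in \mathbb{R}^n_+$, standing for the inputs used by the $k$-th production sector;
and (4) $q_{k o_k}$, for the output of the good $o_k$ by the $k$-th producer.

\[
\begin{aligned}
\pi_{n+i}\, x_{i,n+i} &\ge \sum_{j=1}^n \pi_j w_{ij}, && \text{for } 1 \le i \le m && (6.10) \\
q_{k o_k} &\le f_k(z_k), && \text{for } 1 \le k \le \ell + m && (6.11) \\
\pi_{o_k} &\le c_k(\pi_1, \ldots, \pi_n), && \text{for } 1 \le k \le \ell + m && (6.12) \\
\sum_k z_{kj} &\le \sum_i w_{ij} + \sum_{k : o_k = j} q_{kj}, && \text{for } 1 \le j \le n && (6.13) \\
x_{i,n+i} &\le q_{\ell+i,\,n+i}, && \text{for } 1 \le i \le m && (6.14)
\end{aligned}
\]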

Here, $c_k()$ denotes the $k$-th producer's unit cost function, which depends only on
the prices of the first $n$ goods. Evidently, any equilibrium is a feasible solution to the
system of inequalities $G$. What is not so evident is that any feasible solution of $G$ is
an equilibrium. To see this, we first note that the sets of inequalities (6.11) and (6.12)
imply that no producer can make positive profit: we have
$\sum_{j \le n} \pi_j z_{kj} \ge \pi_{o_k} q_{k o_k}$ for
each producer $k$. Adding up these inequalities, as well as the inequalities (6.10), we
get a certain inequality that says that the cost of the consumer and producer demands
is greater than or equal to the cost of the initial endowments and producer outputs.
Whereas by multiplying each inequality in (6.13) and (6.14) by the corresponding price
and adding up these inequalities, we get that the cost of the consumer and producer
demands is less than or equal to the cost of the initial endowments and producer
outputs.

This implies that the two costs must be equal. From this it follows that
$\sum_{j \le n} \pi_j z_{kj} = \pi_{o_k} q_{k o_k}$ for each producer $k$. Each production plan makes zero profit.
Since (6.12) ensures that profit maximization is well defined, these are optimal production plans.
Furthermore, we must have equality in (6.10): $x_{i,n+i}$ is the demand of good $n + i$ at
price $\pi$. Since conservation of goods is guaranteed by (6.13) and (6.14), we conclude
that any solution of $G$ is an equilibrium.


6.6.2 Convex Programs for Specific Functions


Let us make the substitution $\pi_j = e^{\psi_j}$ in the system of inequalities above. This makes
all the constraints convex, except possibly for the ones in (6.12). Whenever each
inequality in the set (6.12) also becomes a convex constraint, we get a convex feasibility
characterization of the equilibrium prices.

Let us first consider what happens to the constraint in (6.12) corresponding to
a CES production function $f_k(z_1, \ldots, z_n) = (\sum_j a_{kj} z_j^{\rho})^{1/\rho}$, where $0 < \rho < 1$. The
corresponding constraint is
$\pi_{o_k} \le c_k(\pi) = (\sum_j a_{kj}^{\sigma} \pi_j^{1-\sigma})^{1/(1-\sigma)}$, where $\sigma = 1/(1 - \rho)$
(we use a standard expression for the cost function corresponding to the CES production
function $f_k$). Raising both sides to the power $(1 - \sigma)$, and noting that $1 - \sigma < 0$, this
constraint becomes

\[ \pi_{o_k}^{1-\sigma} \ge \sum_j a_{kj}^{\sigma} \pi_j^{1-\sigma}. \]

It is now easy to see that the substitution $\pi_j = e^{\psi_j}$ turns this inequality into a convex
constraint.
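
To make the last step concrete: with $\sigma = 1/(1-\rho) > 1$, the constraint
$\pi_{o_k} \le c_k(\pi)$ is equivalent, after the substitution $\pi_j = e^{\psi_j}$, to requiring that
$\sum_j a_{kj}^{\sigma} e^{(1-\sigma)(\psi_j - \psi_{o_k})} \le 1$, a sum of exponentials of linear
functions of $\psi$. The sketch below is a hypothetical illustration (function names and numbers
are not from the chapter) that evaluates the standard CES unit cost and this exponential form.

```python
import math


def ces_unit_cost(a, pi, rho):
    """Standard unit cost of the CES technology (sum_j a_j z_j^rho)^(1/rho), 0 < rho < 1."""
    sigma = 1.0 / (1.0 - rho)
    total = sum(aj ** sigma * pj ** (1.0 - sigma) for aj, pj in zip(a, pi))
    return total ** (1.0 / (1.0 - sigma))


def log_price_constraint(a, psi, psi_out, rho):
    """Left-hand side of the exponential form; the constraint holds iff it is <= 1."""
    sigma = 1.0 / (1.0 - rho)
    return sum(aj ** sigma * math.exp((1.0 - sigma) * (pj - psi_out))
               for aj, pj in zip(a, psi))


a, rho = [1.0, 1.0], 0.5
cost = ces_unit_cost(a, [1.0, 1.0], rho)                      # input prices (1, 1)
ok_value = log_price_constraint(a, [0.0, 0.0], math.log(0.4), rho)
bad_value = log_price_constraint(a, [0.0, 0.0], math.log(0.6), rho)
```

Here the unit cost is 0.5, so an output price of 0.4 satisfies the constraint (value 0.8)
while an output price of 0.6 violates it (value 1.2).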

It is also easy to verify, using standard formulas for the cost functions, that the
constraint in (6.12) corresponding to a linear or a Cobb-Douglas production function
also becomes convex after the substitution $\pi_j = e^{\psi_j}$.

Thus, we obtain convex programs characterizing the equilibria in constant returns
economies where the utility and production functions are linear, Cobb-Douglas, or CES
with $\rho > 0$. The approach also works for a certain family of nested CES functions.
Interestingly, the use of production technologies to simplify the consumption side
plays a key role in obtaining convex programs for pure exchange economies with nested
CES utility functions.


6.7 Bibliographic Notes 


The convex program of Section 6.2 is due to Eisenberg (1961). Generalizing an ap- 
proach due to Eisenberg and Gale (1959) and Gale (1960) for linear utilities, Eisenberg 
(1961) shows how to write the equilibrium conditions for the Fisher model as the so- 
lution to a convex program whenever the traders have homogeneous utility functions. 

Eisenberg's program can also be seen as following from Negishi's theorem. However,
Eisenberg establishes an arguably stronger result. Without loss of generality, assume
$\sum_i e_i = 1$. Consider the social utility function $u : \mathbb{R}^n_+ \to \mathbb{R}$ that assigns to each
$s \in \mathbb{R}^n_+$ the value

\[
\max \Big\{ \prod_{i=1}^m u_i(x_i)^{e_i} \ \Big|\ x_i \in \mathbb{R}^n_+,\ \sum_i x_i \le s \Big\}.
\]

Eisenberg shows that $u$ is homogeneous and concave, and that at any price vector $\pi$
the market demand generated by the Fisher economy with $m$ traders is identical to the
demand of a single trader with utility function $u$ and income 1.

Polterovich (1973) extends Eisenberg's program to a generalization of the Fisher
model that includes production. Jain et al. (2005) generalize this result to quasi-concave,
homothetic utilities, and also consider economies of scale in production.

Lemma 6.4 of Section 6.3 has been proven by Arrow et al. (1959) under the stronger 
assumption of GS. It was later shown to generalize to markets which satisfy only WGS 
(Arrow and Hurwicz, 1960a, 1960b). 

Polterovich and Spivak (1983) extended the characterization of Lemma 6.4 to scenarios
where the demand is a set-valued function of the prices, which includes in
particular the exchange model with linear utilities. This extension says that for any
equilibrium price $\hat{\pi}$, and nonequilibrium price $\pi$, and any vector $z \in \mathbb{R}^n$ that is chosen
from the set of aggregate excess demands of the market at $\pi$, we have $\hat{\pi} \cdot z > 0$.

The simple algorithm of Section 6.3.1, which is a discrete version of the tatonnement
process, is introduced and analyzed in Codenotti et al. (2005). Lemma 6.7 can also
be used with the Ellipsoid method, as shown by Codenotti et al. (2005), to compute a
weak $(1 + \epsilon)$-approximate equilibrium in polynomial time. That is, the dependence of
the running time on $1/\epsilon$ can be made polynomial in $\log(1/\epsilon)$.

The convex feasibility program of Section 6.4.1 is due to Nenakov and Primak (1983) 
and Jain (2004). For linear utilities, an equilibrium price vector whose components are 
small rational numbers exists. Jain (2004) proposes a variant of the Ellipsoid algorithm 
that, exploiting this, uses the separation hyperplane implied by the convex program to 
compute the equilibrium exactly in polynomial time. Ye (in press) presents an efficient 
interior-point algorithm that computes the exact equilibrium in polynomial time. The 
convex program of Section 6.4.2 has been introduced in Codenotti et al. (2005). 

Section 6.5.1 describes a market with two traders and two goods that has multiple
disconnected equilibria. This example was proposed by Gjerstad (1996).

The class PPAD introduced in Section 6.5.2 was defined by Papadimitriou (1994).
The game-market correspondence was shown in Codenotti et al. (2006). The
PPAD-completeness of the computation of a Nash equilibrium for a bimatrix game is due
to Chen and Deng (2005b). Chen and Deng's result came after a sequence of works,
where first the PPAD-completeness of 4-player games (Daskalakis et al., 2005), and
then of 3-player games (Chen and Deng, 2005a; Daskalakis and Papadimitriou, 2005)
were proven.

The convex program of Section 6.6 has been introduced in Jain and Varadarajan 
(2006). We have not mentioned several other results on convex programs for production 
models. We refer the interested reader to Jain and Varadarajan (2006) and the references 
therein.


Exercises


6.1 Use the Karush-Kuhn-Tucker conditions to derive an explicit expression for the de- 
mand of a consumer with a Cobb-Douglas utility function. Also derive formula 6.7, 
the expression for the demand with a CES function. 


6.2 Show that for an exchange economy with Cobb-Douglas utility functions, the positive
equilibrium prices can be characterized as solutions to a linear feasibility
program with variables for the prices. The number of constraints of the program
must be polynomial in the number of traders and goods.

6.3 Prove that Lemma 6.4 implies that the set of equilibrium prices is convex.

6.4 Prove parts (2), (3), and (4) of Lemma 6.6.

6.5 Suppose that $\pi$ and $\hat{\pi}$ are two price vectors such that
$\max_j \pi_j/\hat{\pi}_j \le (1 + \epsilon/3) \min_j \pi_j/\hat{\pi}_j$
and $\hat{\pi}$ is an equilibrium. Show that $\pi$ is a weak $(1 + \epsilon)$-approximate equilibrium.


CHAPTER 7 


Graphical Games 


Michael Kearns 


Abstract 


In this chapter we examine the representational and algorithmic aspects of a class of graph-theoretic 
models for multiplayer games. Known broadly as graphical games, these models specify restric- 
tions on the direct payoff influences among the player population. In addition to a number of nice 
computational properties, these models have close connections to well-studied graphical models for 
probabilistic inference in machine learning and statistics. 


7.1 Introduction 


Representing multiplayer games with large player populations in the normal form 
is undesirable for both practical and conceptual reasons. On the practical side, the 
number of parameters that must be specified grows exponentially with the size of the 
population. On the conceptual side, the normal form may fail to capture structure that 
is present in the strategic interaction, and which can aid understanding of the game 
and computation of its equilibria. For this reason, there have been many proposals for 
parametric multiplayer game representations that are more succinct than the normal 
form, and attempt to model naturally arising structural properties. Examples include 
congestion and potential games and related models (Monderer and Shapley, 1996; 
Rosenthal, 1973). 

Graphical games are a representation of multiplayer games meant to capture and 
exploit locality or sparsity of direct influences. They are most appropriate for large 
population games in which the payoffs of each player are determined by the actions 
of only a small subpopulation. As such, they form a natural counterpart to earlier 
parametric models. Whereas congestion games and related models implicitly assume 
a large number of weak influences on each player, graphical games are suitable when 
there is a small number of strong influences. 

Graphical games adopt a simple graph-theoretic model. A graphical game is de- 
scribed at the first level by an undirected graph G in which players are identified with 
vertices. The semantics of the graph are that a player or vertex i has payoffs that are
entirely specified by the actions of i and those of its neighbor set in G. Thus G alone 
may already specify strong qualitative constraints or structure over the direct strategic 
influences in the game. To fully describe a graphical game, we must additionally spec- 
ify the numerical payoff functions to each player — but now the payoff to player i is a 
function only of the actions of i and its neighbors, rather than the actions of the entire 
population. In the many natural settings where such local neighborhoods are much 
smaller than the overall population size, the benefits of this parametric specification 
over the normal form are already considerable. 

But several years of research on graphical games has demonstrated that the advan- 
tages of this model extend well beyond simple parsimony — rather, they are compu- 
tational, structural, and interdisciplinary as well. We now overview each of these in 
turn. 

Computational. Theoretical computer science has repeatedly established that strong 
but naturally occurring constraints on optimization and other problems can be exploited 
algorithmically, and game theory is no exception. Graphical games provide a rich 
language in which to state and explore the computational benefits of various restrictions 
on the interactions in a large-population game. As we shall see, one fruitful line of 
research has investigated topological restrictions on the underlying graph G that yield 
efficient algorithms for various equilibrium computations. 

Structural. In addition to algorithmic insights, graphical games also provide a pow- 
erful framework in which to examine the relationships between the network structure 
and strategic outcomes. Of particular interest is whether and when the local interactions 
specified by the graph G alone (i.e., the topology of G, regardless of the numerical 
specifications of the payoffs) imply nontrivial structural properties of equilibria. We 
will examine an instance of this phenomenon in some detail. 

Interdisciplinary. Part of the original motivation for graphical games came from 
earlier models familiar to the machine learning, AI and statistics communities — collec- 
tively known as graphical models for probabilistic inference, which include Bayesian 
networks, Markov networks, and their variants. Broadly speaking, both graphical mod- 
els for inference and graphical games represent complex interactions between a large 
number of variables (random variables in one case, the actions of players in a game in 
the other) by a graph combined with numerical specification of the interaction details. 
In probabilistic inference the interactions are stochastic, whereas in graphical games 
they are strategic (best response). As we shall discuss, the connections to probabilis- 
tic inference have led to a number of algorithmic and representational benefits for 
graphical games. 

In this chapter we will overview graphical games and the research on them to 
date. We will center our discussion around two main technical results that will be 
examined in some detail, and are chosen to illustrate the computational, structural, and 
interdisciplinary benefits discussed above. These two case studies will also serve as 
natural vehicles to survey the broader body of literature on graphical games. 

The first problem we shall examine is the computation of Nash equilibria in graphical 
games in which the underlying graph G is a tree (or certain generalizations of trees). 
Here we will discuss a natural two-pass algorithm for computing Nash equilibria 
requiring only the local exchange of “conditional equilibrium” information over the 
edges of G. This algorithm comes in two variations: one that runs in time polynomial
in the representation size of the graphical game and computes (a compact representation 
of) approximations of all Nash equilibria, and another that runs in exponential time but 
computes (a compact representation of) all Nash equilibria exactly. We will discuss 
a number of generalizations of this algorithm, including one known as NashProp, 
which has close ties to the well-known belief propagation algorithm in probabilistic 
inference. Together these algorithms provide examples of the algorithmic exploitation 
of structural restrictions on the graph. 

The second problem we shall examine is the representation and computation of the 
correlated equilibria of a graphical game. Here we will see that there is a satisfying 
and natural connection between graphical games and the probabilistic models known 
as Markov networks, which can succinctly represent high-dimensional multivariate 
probability distributions. More specifically, we shall show that any graphical game with 
graph G can have all of its correlated equilibria (up to payoff equivalence) represented 
by a Markov network with the same network structure. If we adopt the common view of 
correlated equilibria as permitting “shared” or “public” randomization (the source of the 
correlations) — whereas Nash equilibria permit only “private” randomization or mixed 
strategies — this result implies that the shared randomization can actually be distributed 
locally throughout the graph, and that distant parties need not be (directly) correlated. 
From the rich tools developed for independence analysis in Markov networks, it also 
provides a compact representation of a large number of independence relationships 
between player actions that may be assumed at (correlated) equilibrium. The result 
thus provides a good example of a direct connection between graph structure and 
equilibrium properties, as well as establishing further ties to probabilistic inference. 
We shall also discuss the algorithmic benefits of this result. 

After studying these two problems in some detail, we will briefly overview recent 
research incorporating network structure into other game-theoretic and economic
settings, such as exchange economies (Arrow-Debreu, Fisher, and related models). Again
the emphasis will be on computational aspects of these models, and on the relationship 
between graph structure and equilibrium properties. 


7.2 Preliminaries 


In this section we shall provide formal definitions for graphical games, along with 
other needed definitions, terminology, and notation. We begin with notions standard to 
classical multiplayer game theory. 

A multiplayer game consists of n players, each with a finite set of pure strategies 
or actions available to them, along with a specification of the payoffs to each player. 
Throughout the chapter, we use $a_i$ to denote the action chosen by player i. For simplicity
we will assume a binary action space, so $a_i \in \{0, 1\}$. (The generalization of the results
examined here to the multiaction setting is straightforward.) The payoffs to player i
are given by a table or matrix $M_i$, indexed by the joint action $a \in \{0, 1\}^n$. The value
$M_i(a)$, which we assume without loss of generality to lie in the interval [0, 1], is the
payoff to player i resulting from the joint action a. Multiplayer games described in this 
way are referred to as normal form games. 

The actions 0 and 1 are the pure strategies of each player, while a mixed strategy
for player i is given by the probability $p_i \in [0, 1]$ that the player will play 0. For
any joint mixed strategy, given by a product distribution $p$, we define the expected
payoff to player i as $M_i(p) = E_{a \sim p}[M_i(a)]$, where $a \sim p$ indicates that each $a_i$ is 0
with probability $p_i$ and 1 with probability $1 - p_i$ independently. When we introduce
correlated equilibria below, we shall allow the possibility that the distribution over $a$ is
not a product distribution, but has correlations between the $a_i$.

We use $p[i : p'_i]$ to denote the vector (product distribution) which is the same as
$p$ except in the ith component, where the value has been changed to $p'_i$. A Nash
equilibrium (NE) for the game is a mixed strategy $p$ such that for any player i, and for
any value $p'_i \in [0, 1]$, $M_i(p) \ge M_i(p[i : p'_i])$. (We say that $p_i$ is a best response to the
rest of $p$.) In other words, no player can improve their expected payoff by deviating
unilaterally from an NE. The classic theorem of Nash (1951) states that for any game, 
there exists an NE in the space of joint mixed strategies. 

We will also use a straightforward (additive) definition for approximate Nash
equilibria. An $\epsilon$-Nash equilibrium is a mixed strategy $p$ such that for any player i, and for
any value $p'_i \in [0, 1]$, $M_i(p) + \epsilon \ge M_i(p[i : p'_i])$. (We say that $p_i$ is an $\epsilon$-best response
to the rest of $p$.) Thus, no player can improve their expected payoff by more than $\epsilon$ by
deviating unilaterally from an approximate NE.
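
The two definitions above can be checked by brute force on small games. The following is a minimal sketch, not part of the original text: the function names `expected_payoff` and `is_eps_nash`, the dictionary encoding of the payoff tables, and the matching-pennies payoffs are all illustrative assumptions. It uses the fact that $M_i(p)$ is linear in $p_i$, so only the two pure deviations need to be tested.

```python
from itertools import product

def expected_payoff(M_i, p):
    """E_{a~p}[M_i(a)], where p[j] is the probability that player j plays 0."""
    total = 0.0
    for a in product((0, 1), repeat=len(p)):
        prob = 1.0
        for a_j, p_j in zip(a, p):
            prob *= p_j if a_j == 0 else 1.0 - p_j
        total += prob * M_i[a]
    return total

def is_eps_nash(M, p, eps=0.0, tol=1e-12):
    """True if no player gains more than eps by deviating unilaterally."""
    for i, M_i in enumerate(M):
        current = expected_payoff(M_i, p)
        # The expected payoff is linear in p_i, so checking the two pure
        # deviations p_i = 0 and p_i = 1 covers every mixed deviation.
        for pure in (0.0, 1.0):
            deviated = list(p)
            deviated[i] = pure
            if expected_payoff(M_i, deviated) > current + eps + tol:
                return False
    return True

# Hypothetical example: matching pennies with payoffs in [0, 1].
# Player 0 wants to match, player 1 wants to mismatch.
M0 = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
M1 = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}

print(is_eps_nash([M0, M1], [0.5, 0.5]))            # the uniform profile
print(is_eps_nash([M0, M1], [0.6, 0.5]))            # a perturbed profile
print(is_eps_nash([M0, M1], [0.6, 0.5], eps=0.15))  # same profile, eps = 0.15
```

The enumeration over all $2^n$ joint actions is exactly the cost that the graphical game representation is meant to avoid, so this sketch is only practical for a handful of players.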

We are now ready to introduce the graphical game model. In a graphical game, each 
player i is represented by a vertex in an undirected graph G. We use $N(i) \subseteq \{1, \ldots, n\}$
to denote the neighborhood of player i in G, that is, those vertices j such that the
edge (i, j) appears in G. By convention N(i) always includes i itself as well. If $a$ is a
joint action, we use $a^i$ to denote the projection of $a$ onto just the players in N(i).


Definition 7.1. A graphical game is a pair (G, M), where G is an undirected
graph over the vertices $\{1, \ldots, n\}$, and M is a set of n local game matrices. For
any joint action $a$, the local game matrix $M_i \in M$ specifies the payoff $M_i(a^i)$ for
player i, which depends only on the actions taken by the players in N(i).
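
One possible way to encode Definition 7.1 is sketched below. The helper names `project` and `local_payoff`, the 0-indexed players, and the three-player path are assumptions made for illustration only; the point is that the local table of player i is indexed by the projection $a^i$ rather than by the full joint action.

```python
def project(a, N_i):
    """Projection a^i of the joint action a onto the neighborhood N(i)."""
    return tuple(a[j] for j in sorted(N_i))

def local_payoff(M_i, N_i, a):
    """Payoff M_i(a^i): only the actions of players in N(i) are consulted."""
    return M_i[project(a, N_i)]

# Hypothetical 3-player path 0 - 1 - 2 (0-indexed); N(i) includes i itself.
N = [{0, 1}, {0, 1, 2}, {1, 2}]
# Player 0 is paid 1 when it matches its only neighbor, player 1.
M_0 = {(a0, a1): float(a0 == a1) for a0 in (0, 1) for a1 in (0, 1)}

# Changing the action of player 2 cannot change the payoff of player 0.
print(local_payoff(M_0, N[0], (1, 1, 0)))
print(local_payoff(M_0, N[0], (1, 1, 1)))
```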


Remarks. Graphical games are a (potentially) more compact way of representing 
games than standard normal form. In particular, rather than requiring a number of 
parameters that is exponential in the number of players n, a graphical game requires 
a number of parameters that is exponential only in the size d of the largest local 
neighborhood. Thus if $d \ll n$ (that is, the number of direct influences on any player
is much smaller than the overall population size), the graphical game representation is
dramatically smaller than the normal form. Note that we can represent any normal form 
game as a graphical game by letting G be the complete graph, but the representation 
is only useful when a considerably sparser graph can be found. It is also worth noting 
that although the payoffs to player i are determined only by the actions of the players 
in N(i), equilibrium still requires global coordination across the player population — if 
player i is connected to player j who is in turn connected to player k, then i and 
k indirectly influence each other via their mutual influence on the payoff of j. How
local influences propagate to determine global equilibrium outcomes is one of the 
computational challenges posed by graphical games. 
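
To make the size comparison in these remarks concrete, the sketch below counts table entries for binary actions. The function names and the 20-player ring are illustrative assumptions; the counts follow directly from the definitions (n tables of $2^n$ entries for the normal form, one table of $2^{|N(i)|}$ entries per player for the graphical game).

```python
def normal_form_size(n):
    """n payoff tables, each indexed by a joint action in {0,1}^n."""
    return n * 2 ** n

def graphical_game_size(neighborhoods):
    """One local table per player, indexed by the actions of N(i)."""
    return sum(2 ** len(N_i) for N_i in neighborhoods)

n = 20
# Hypothetical ring: each player is influenced by itself and two neighbors.
ring = [{(i - 1) % n, i, (i + 1) % n} for i in range(n)]
# Complete graph: every neighborhood is the whole population.
complete = [set(range(n)) for _ in range(n)]

print(normal_form_size(n))            # normal form
print(graphical_game_size(ring))      # sparse graphical game
print(graphical_game_size(complete))  # graphical game on the complete graph
```

On the complete graph the two counts coincide, which matches the observation that any normal form game is a graphical game with no savings.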

In addition to Nash equilibrium, we will also examine graphical games in the context 
of correlated equilibria (CE). CE (Aumann, 1974) generalize NE, and can be viewed as 
(possibly arbitrary) distributions P(a) over joint actions satisfying a certain conditional
expectation property. 

The intuition behind CE can be described as follows. Imagine that there is a trusted 
party that faithfully draws a joint action a according to distribution P, and distributes 
to each player i only their private component $a_i$. If P is a product distribution, as in
the NE case, then due to the independence between all players the revelation of $a_i$ does
not condition player i’s beliefs over the play of others. For general P, however, this is
not true. The CE condition asks that the expected payoff to i if he is “obedient” and
plays $a_i$ be at least as great as the amount i could earn by “cheating” and deviating to
play a different action. In other words, in Bayesian terms, despite the observation of $a_i$
updating the posterior distribution over the other player actions from i’s perspective,
it is still payoff-optimal for i to play $a_i$. This leads to the formal definition below, in
which for any given joint distribution $P(a)$ over player actions and $b \in \{0, 1\}$, we let
$P_{a_i = b}$ denote the distribution on $a$ conditioned on the event that $a_i = b$.


Definition 7.2. A correlated equilibrium (CE) for a two-action normal form
game is a distribution $P(a)$ over actions satisfying

$$\forall i \in \{1, \ldots, n\},\ \forall b \in \{0, 1\}: \quad E_{a \sim P_{a_i = b}}[M_i(a)] \ge E_{a \sim P_{a_i = b}}[M_i(a[i : \neg b])]$$

The expectation $E_{a \sim P_{a_i = b}}[M_i(a)]$ is over those cases in which the value $a_i = b$ is
revealed to player i, who proceeds to “honestly” play $a_i = b$. The expectation
$E_{a \sim P_{a_i = b}}[M_i(a[i : \neg b])]$ is over the same cases, but now player i unilaterally
deviates to play $a_i = \neg b$, whereas the other players faithfully play from the conditional
distribution $P_{a_i = b}$. It is straightforward to generalize this definition to the multiaction
case — again, we demand that it be optimal for each player to take the action provided 
by the trusted party, despite the conditioning information revealed by this action. 


Remarks. CE offers a number of conceptual and computational advantages over 
NE, including the facts that new and sometimes more “fair” payoffs can be achieved, 
that CE can be computed efficiently for games in standard normal form (though recall 
that “efficiently” here means exponential in the number of players, an issue we shall 
address), and that CE are the convergence notion for several natural “no-regret” learning 
algorithms (Foster and Vohra, 1999). Furthermore, it has been argued that CE is the 
natural equilibrium concept consistent with the Bayesian perspective (Aumann, 1987; 
Foster and Vohra, 1997). One of the most interesting aspects of CE is that they broaden 
the set of “rational” solutions for normal form games without the need to address often 
difficult issues such as stability of coalitions and payoff imputations (Aumann, 1987). 
The traffic signal is often cited as an informal everyday example of CE, in which a 
single bit of shared information allows a fair split of waiting times (Owen, 1995). In 
this example, no player stands to gain greater payoff by unilaterally deviating from 
the correlated play, for instance by “running a light.” This example also illustrates a 
common alternative view of CE, in which correlations arise as a result of “public” or 
“shared” random bits (in addition to the “private” random bits allowed in the standard 
mixed strategies or product distributions of NE). Here the state of the traffic light itself 
(which can be viewed as a binary random variable, alternately displayed as red and 
green to orthogonal streets) provides the shared randomization. 
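
The CE condition of Definition 7.2 can be tested directly on a toy version of the traffic signal. The sketch below is an illustration under stated assumptions: the function name `is_correlated_eq`, the encoding 0 = go and 1 = stop, and the specific payoff numbers are invented for the example and do not come from the text. Comparing the unnormalized sums is equivalent to comparing the conditional expectations, since both sides are divided by the same probability $P(a_i = b)$.

```python
def is_correlated_eq(M, P, tol=1e-12):
    """Check the CE condition for binary actions.
    M: list of payoff dicts keyed by joint actions; P: dict joint action -> probability."""
    for i in range(len(M)):
        for b in (0, 1):
            obey = deviate = 0.0
            for a, prob in P.items():
                if a[i] != b:
                    continue  # only cases in which a_i = b is revealed to player i
                flipped = a[:i] + (1 - b,) + a[i + 1:]
                obey += prob * M[i][a]
                deviate += prob * M[i][flipped]
            if deviate > obey + tol:
                return False
    return True

# Hypothetical crossing game (all numbers are assumptions): action 0 = go, 1 = stop.
# Both go: 0 (collision). Go while the other stops: 1. Stop while the other goes: 0.5.
# Both stop: 0.4.
M0 = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 0.5, (1, 1): 0.4}
M1 = {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 1.0, (1, 1): 0.4}

# The "traffic light": one shared bit decides who goes.
light = {(0, 1): 0.5, (1, 0): 0.5}
print(is_correlated_eq([M0, M1], light))
# Both always stopping is not a CE: either player would rather go.
print(is_correlated_eq([M0, M1], {(1, 1): 1.0}))
```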


7.3 Computing Nash Equilibria in Tree Graphical Games


In this section, we describe the first and perhaps most basic algorithm exploiting the ad- 
vantages of graphical game representation for the purposes of equilibrium computation. 
The case considered is that in which the underlying graph G is a tree. While obviously a 
strong restriction on the topology, we shall see that this case already presents nontrivial 
computational challenges, which in turn force the development of algorithmic tools 
that can be generalized beyond trees to obtain a more general heuristic known as
NashProp. We first describe the algorithm TreeNash at a high level, leaving certain
important implementation details unspecified, because it is conceptually advantageous 
to do so. We then describe two instantiations of the missing details — yielding one 
algorithm that runs in polynomial time and provably computes approximations of all 
equilibria, and another algorithm that runs in exponential time and provably computes 
all exact equilibria. 

We begin with some notation and concepts needed for the description of TreeNash. 
In order to distinguish parents from children in the tree, it will be convenient to treat 
players/vertices symbolically (such as U, V, and W) rather than by integer indices, so 
we use $M_V$ to denote the local game matrix for the player identified with player/vertex
V. We use capital letters to denote vertex/players to distinguish them from their chosen 
actions, for which we shall use lower case. If G is a tree, we choose an arbitrary vertex 
as the root (which we visualize as being at the bottom, with the leaves at the top). Any 
vertex on the path from a vertex V to the root will be called downstream from V, and 
any vertex on a path from V to any leaf will be called upstream from V. Thus, each 
vertex other than the root has exactly one downstream neighbor (or child), and perhaps 
many upstream neighbors (or parents). We use $\mathrm{UP}_G(V)$ to denote the set of all vertices
in G that are upstream from V, including V by definition. 

Suppose that V is the child of U in G. We let $G^U$ denote the subgraph induced by
the vertices in $\mathrm{UP}_G(U)$, that is, the subtree of G rooted at U. If $v \in [0, 1]$ is a mixed
strategy for player (vertex) V, $M^U_{V=v}$ will denote the subset of payoff matrices in M
corresponding to the vertices in $\mathrm{UP}_G(U)$, with the modification that the game matrix
$M_U$ is collapsed by one index by fixing V = v. We can think of an NE for the graphical
game $(G^U, M^U_{V=v})$ as a conditional equilibrium “upstream” from U (inclusive), that
is, an equilibrium for $G^U$ given that V plays v. Here we are simply exploiting the fact
that since G is a tree, fixing a mixed strategy v for the play of V isolates $G^U$ from the
rest of G.

Now suppose that vertex V has k parents $U_1, \ldots, U_k$, and the single child W. We
now describe the data structures sent from each $U_i$ to V, and in turn from V to W,
on the downstream pass of TreeNash. Each parent $U_i$ will send to V a binary-valued
“table” $T(v, u_i)$. The table is indexed by the continuum of possible values for the
mixed strategies $v \in [0, 1]$ of V and $u_i \in [0, 1]$ of $U_i$, $i = 1, \ldots, k$. The semantics
of this table will be as follows: for any pair $(v, u_i)$, $T(v, u_i)$ will be 1 if and only if
there exists an NE for $(G^{U_i}, M^{U_i}_{V=v})$ in which $U_i = u_i$. Note that we will slightly abuse
notation by letting $T(v, u_i)$ refer to both the entire table sent from $U_i$ to V, and the
particular value associated with the pair $(v, u_i)$, but the meaning will be clear from the
context.

Algorithm TreeNash
Inputs: Graphical game (G, M) in which G is a tree.
Output: A Nash equilibrium for (G, M).

(i) Compute a depth-first ordering of the vertices of G.
(ii) (Downstream Pass) For each vertex V in depth-first order:
    (a) Let vertex W be the child of V (or nil if V is the root).
    (b) For all $w, v \in [0, 1]$, initialize T(w, v) to be 0 and the witness list for
        T(w, v) to be empty.
    (c) If V is a leaf (base case):
        1. For all $w, v \in [0, 1]$, set T(w, v) to be 1 if and only if V = v is a best
           response to W = w (as determined by the local game matrix $M_V$).
    (d) Else (inductive case, V is an internal vertex):
        1. Let $U = (U_1, \ldots, U_k)$ be the parents of V; let $T(v, u_i)$ be the table
           passed from $U_i$ to V on the downstream pass.
        2. For all $w, v \in [0, 1]$ and for all joint mixed strategies $u = (u_1, \ldots, u_k)$
           for U: If V = v is a best response to W = w, U = u (as determined
           by the local game matrix $M_V$), and $T(v, u_i) = 1$ for $i = 1, \ldots, k$, set
           T(w, v) to be 1 and add u to the witness list for T(w, v).
    (e) Pass the table T(w, v) from V to W.
(iii) (Upstream Pass) For each vertex V in reverse depth-first ordering (starting at
      the root):
    (a) Let $U = (U_1, \ldots, U_k)$ be the parents of V (or the empty list if V is a leaf);
        let W be the child of V (or nil if V is the root), and (w, v) the values passed
        from W to V on the upstream pass.
    (b) Label V with the value v.
    (c) (Non-deterministically) Choose any witness u to T(w, v) = 1.
    (d) For $i = 1, \ldots, k$, pass $(v, u_i)$ from V to $U_i$.


Figure 7.1. Algorithm TreeNash for computing NE of tree graphical games. 


Since v and $u_i$ are continuous variables, it is not obvious that the table $T(v, u_i)$ can
be represented compactly, or even finitely, for arbitrary vertices in a tree. For now we 
will simply assume a finite representation, and shortly discuss how this assumption can 
be met in two different ways. 

The initialization of the downstream pass of the algorithm begins at the leaves of 
the tree, where the computation of the tables is straightforward. If U is a leaf and V its 
only child, then T(v, u) = 1 if and only if U = u is a best response to V = v (Step (ii)(c)
of Figure 7.1).

Assuming for induction that each $U_i$ sends the table $T(v, u_i)$ to V, we now describe
how V can compute the table T(w, v) to pass to its child W (Step (ii)(d)2 of Figure 7.1).
For each pair (w, v), T(w, v) is set to 1 if and only if there exists a vector of mixed
strategies $u = (u_1, \ldots, u_k)$ (called a witness) for the parents $U = (U_1, \ldots, U_k)$ of V such that

(i) $T(v, u_i) = 1$ for all $1 \le i \le k$; and
(ii) V = v is a best response to U = u, W = w.


Note that there may be more than one witness for T(w, v) = 1. In addition to 
computing the value T(w, v) on the downstream pass of the algorithm, V will also 
keep a list of the witnesses u for each pair (w, v) for which T(w, v) = 1 (Step (ii)(d)2
of Figure 7.1). These witness lists will be used on the upstream pass. 

To see that the semantics of the tables are preserved by the computation just
described, suppose that this computation yields T(w, v) = 1 for some pair (w, v), and let u
be a witness for T(w, v) = 1. The fact that $T(v, u_i) = 1$ for all i (condition (i) above)
ensures by induction that if V plays v, there are upstream NE in which each $U_i = u_i$.
Furthermore, v is a best response to the local settings $U_1 = u_1, \ldots, U_k = u_k$, W = w
(condition (ii) above). Therefore, we are in equilibrium upstream from V. On the
other hand, if T(w, v) = 0, it is easy to see there can be no equilibrium in which
W = w, V = v. Note that the existence of an NE guarantees that T(w, v) = 1 for at
least one (w, v) pair.

The downstream pass of the algorithm terminates at the root Z, which receives 
tables $T(z, y_i)$ from each parent $Y_i$. Z simply computes a one-dimensional table T(z)
such that T(z) = 1 if and only if for some witness y, $T(z, y_i) = 1$ for all i, and z is a
best response to y.

The upstream pass begins by Z choosing any z for which T(z) = 1, choosing any 
witness $(y_1, \ldots, y_k)$ to T(z) = 1, and then passing both z and $y_i$ to each parent $Y_i$.
The interpretation is that Z will play z, and is “instructing” $Y_i$ to play $y_i$. Inductively,
if a vertex V receives a value v to play from its downstream neighbor W, and the
value w that W will play, then it must be that T(w, v) = 1. So V chooses a witness
u to T(w, v) = 1, and passes each parent $U_i$ their value $u_i$ as well as v (Step (iii)
of Figure 7.1). Note that the semantics of T(w, v) = 1 ensure that V = v is a best
response to U = u, W = w.

We have left the choices of each witness in the upstream pass unspecified or non- 
deterministic to emphasize that the tables and witness lists computed represent all the 
NE. The upstream pass can be specialized to find a number of specific NE of interest, 
including player optimum (NE maximizing expected reward to a chosen player), social 
optimum (NE maximizing total expected reward, summed over all players), and wel- 
fare optimum (NE maximizing expected reward to the player whose expected reward 
is smallest). 

Modulo the important details regarding the representation of the tables T(w, v), 
which we discuss next, the arguments provided above establish the following formal 
result. 


Theorem 7.3 Let (G, M) be any graphical game in which G is a tree. Algorithm 
TreeNash computes a Nash equilibrium for (G, M). Furthermore, the tables and 
witness lists computed by the algorithm represent all Nash equilibria of (G, M). 


7.3.1 An Approximation Algorithm 


In this section, we sketch one instantiation of the missing details of algorithm TreeNash 
that yields a polynomial-time algorithm for computing approximate NE for the tree 
game (G, M). The approximation can be made arbitrarily precise with greater
computational effort.

Rather than playing an arbitrary mixed strategy in [0, 1], each player will be
constrained to play a discretized mixed strategy that is a multiple of $\tau$, for some $\tau$ to be
determined by the analysis. Thus, player i plays $q_i \in \{0, \tau, 2\tau, \ldots, 1\}$, and the joint
strategy $q$ falls on the discretized $\tau$-grid $\{0, \tau, 2\tau, \ldots, 1\}^n$. In algorithm TreeNash,
this will allow each table T(v, u) (passed from vertex U to child V) to be represented
in discretized form as well: only the $1/\tau^2$ entries corresponding to the possible $\tau$-grid
choices for U and V are stored, and all computations of best responses in the algorithm
are modified to be approximate best responses.

To quantify how the choice of $\tau$ will influence the quality of the approximate
equilibria found (which in turn will determine the computational efficiency of the 
approximation algorithm), we appeal to the following lemma. We note that this result 
holds for arbitrary graphical games, not only trees. 


Lemma 7.4 Let G be a graph of maximum degree d, and let (G, M) be a 
graphical game. Let p be a Nash equilibrium for (G, M), and let q be the nearest 
(in $L_1$ metric) mixed strategy on the $\tau$-grid. Then q is a $d\tau$-NE for (G, M).


The proof of Lemma 7.4, which we omit, follows from a bound on the $L_1$ distance
for product distributions along with an argument that the strategic properties of the
NE are preserved by the approximation. We note that the original paper (Kearns et al.,
2001) used a considerably worse $L_1$ bound that was exponential in d. However, the
algorithm remains exponential in d simply due to the representational complexity of
the local product distributions. The important point is that $\tau$ needs to depend only on
the local neighborhood size d, not the total number of players n. 
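
As a small worked illustration of Lemma 7.4, the sketch below picks a grid resolution from a target $\epsilon$ and the degree d, and reports the resulting table size. It assumes the reading that a grid with $d\tau \le \epsilon$ suffices; the function names and the sample values of $\epsilon$ and d are hypothetical. Note that the number of players n does not appear anywhere.

```python
import math

def grid_steps(eps, d):
    """Smallest integer m such that tau = 1/m satisfies d * tau <= eps.
    (Computed in floating point, so borderline inputs may round either way.)"""
    return math.ceil(d / eps)

def table_entries(m):
    """Number of (v, u) pairs when both strategies range over {0, 1/m, ..., 1}."""
    return (m + 1) ** 2

# Hypothetical values: target accuracy 0.25, maximum degree 3.
m = grid_steps(0.25, 3)
print(m, 1 / m, table_entries(m))
```

The grid $\{0, \tau, \ldots, 1\}$ has $1/\tau + 1$ points, so each table has $(1/\tau + 1)^2$ entries, which is of order $(1/\tau)^2$.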

It is now straightforward to describe ApproximateTreeNash. This algorithm is 
identical to algorithm TreeNash with the following exceptions: 


• The algorithm now takes an additional input $\epsilon$.

• For any vertex U with child V, the table T(u, v) will contain only entries for u and v
  multiples of $\tau$.

• All computations of best responses in algorithm TreeNash become computations of
  $\epsilon$-best responses in algorithm ApproximateTreeNash.


For the running time analysis, we simply note that each table has $(1/\tau)^2$ entries,
and that the computation is dominated by the downstream calculation of the tables
(Step (ii)(d) of algorithm TreeNash). This requires ranging over all table entries for all
k parents, a computation of order $((1/\tau)^2)^k$. By appropriately choosing the value of $\tau$
in order to obtain the required $\epsilon$-approximations, we obtain the following theorem.


Theorem 7.5 Let (G, M) be a graphical game in which G is a tree with n
vertices, and in which every vertex has at most d parents. For any $\epsilon > 0$, let
$\tau = O(\epsilon/d)$. Then ApproximateTreeNash computes an $\epsilon$-Nash equilibrium for
(G, M). Furthermore, for every exact Nash equilibrium, the tables and witness
lists computed by the algorithm contain an $\epsilon$-Nash equilibrium that is within
$\tau$ of this exact equilibrium (in $L_1$ norm). The running time of the algorithm is
polynomial in $1/\epsilon$, n and $2^d$, which is polynomial in the size of the representation
(G, M).
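
The discretized two-pass procedure can be sketched in a few dozen lines. The code below is an illustrative reading of Figure 7.1 restricted to a grid, not the authors' implementation: the function names, the `parents` dictionary, the key order of the local tables, and the three-vertex example game are all assumptions. A table entry T(w, v) = 1 is represented by the presence of the key `(w, v)` together with its witness list.

```python
from itertools import product

def expected_payoff(M_V, strategies):
    """Expected payoff of a local table M_V keyed by 0/1 tuples;
    strategies[j] is the probability that the j-th indexed player plays 0."""
    total = 0.0
    for a in product((0, 1), repeat=len(strategies)):
        prob = 1.0
        for a_j, s_j in zip(a, strategies):
            prob *= s_j if a_j == 0 else 1.0 - s_j
        total += prob * M_V[a]
    return total

def is_eps_best_response(M_V, v, others, eps):
    """Is mixed strategy v an eps-best response to the neighbors' strategies?"""
    best = max(expected_payoff(M_V, (x,) + others) for x in (0.0, 1.0))
    return expected_payoff(M_V, (v,) + others) + eps >= best - 1e-12

def approximate_tree_nash(root, parents, M, steps, eps):
    """parents[V]: upstream neighbors of V. M[V] is keyed by
    (a_V, a_W, a_U1, ..., a_Uk), with a_W omitted when V is the root.
    Returns {vertex: grid strategy}, or None if no root entry survives."""
    grid = [k / steps for k in range(steps + 1)]
    order, stack = [], [root]
    while stack:                       # pre-order walk away from the root
        V = stack.pop()
        order.append(V)
        stack.extend(parents[V])
    witnesses = {}                     # witnesses[V][(w, v)] = list of witnesses u
    for V in reversed(order):          # downstream pass: leaves are handled first
        table = {}
        for w in ([None] if V == root else grid):
            for v in grid:
                found = []
                for u in product(grid, repeat=len(parents[V])):
                    others = (() if w is None else (w,)) + u
                    consistent = all((v, u_i) in witnesses[U]
                                     for U, u_i in zip(parents[V], u))
                    if consistent and is_eps_best_response(M[V], v, others, eps):
                        found.append(u)
                if found:              # T(w, v) = 1 exactly when a witness exists
                    table[(w, v)] = found
        witnesses[V] = table
    if not witnesses[root]:
        return None
    strategy, pending = {}, [(root, next(iter(witnesses[root])))]
    while pending:                     # upstream pass: the root is handled first
        V, (w, v) = pending.pop()
        strategy[V] = v
        u = witnesses[V][(w, v)][0]    # any witness would do
        for U, u_i in zip(parents[V], u):
            pending.append((U, (v, u_i)))
    return strategy

# Hypothetical path: 0 is the root, 1 is its parent, 2 is a leaf above 1.
# Player 0 wants to match 1, player 1 wants to mismatch 0, player 2 wants to match 1.
parents = {0: [1], 1: [2], 2: []}
M = {
    0: {(a0, a1): float(a0 == a1) for a0 in (0, 1) for a1 in (0, 1)},
    1: {(a1, a0, a2): float(a1 != a0)
        for a1 in (0, 1) for a0 in (0, 1) for a2 in (0, 1)},
    2: {(a2, a1): float(a2 == a1) for a2 in (0, 1) for a1 in (0, 1)},
}
print(approximate_tree_nash(0, parents, M, steps=2, eps=0.0))
```

The sketch always takes the first witness on the upstream pass; replacing that choice with a search over witnesses is where one would specialize to a particular equilibrium of interest. Storing every witness is wasteful for large grids and is done here only to mirror the description in the text.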


7.3.2 An Exact Algorithm 


By approximating the continuously indexed tables T(u, v) in discretized form,
the algorithm developed in Section 7.3.1 side-stepped not only the exact
computation but also a fundamental question about the T(u, v): do the
regions $\{(u, v) \in [0, 1]^2 : T(u, v) = 1\}$ have any interesting geometric structure?
It turns out the answer in the case of trees is affirmative, and can be used
in developing an alternate instantiation of the general TreeNash algorithm of
Section 7.3, one that yields an algorithm for computing (all) exact equilibria, but
in time that is exponential in the number of players n rather than only the degree d.

Although the details are beyond our scope, it is possible to show via an inductive 
argument (where again the leaves of G serve as the base cases) that in any tree graphical 
game, for any of the tables T(u, v) defined by TreeNash, the region
$\{(u, v) \in [0, 1]^2 : T(u, v) = 1\}$ can be represented by a finite union of (axis-aligned) rectangular regions
in $[0, 1]^2$ (i.e., regions that are defined as products of closed intervals $[a, a'] \times [b, b']$ for
some $0 \le a \le a' \le 1$, $0 \le b \le b' \le 1$). The induction shows that the number of such
regions multiplies at each level as we progress downstream toward the root, yielding a 
worst-case bound on the number of rectangular regions that is exponential in n. 
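
The base case of this induction is easy to make explicit. For a leaf U with child V, the difference between U's expected payoffs for actions 0 and 1 is linear in v, so the best-response region splits into at most three rectangles. The sketch below illustrates only this leaf case and says nothing about the inductive step; the function names, the key order `(a_U, a_V)`, and the example payoffs are assumptions.

```python
def leaf_rectangles(M_U):
    """Rectangles ((v_lo, v_hi), (u_lo, u_hi)) whose union is
    {(v, u) : u is a best response of leaf U to its child V playing v}.
    M_U is keyed by (a_U, a_V); u and v are probabilities of playing 0."""
    # Delta(v) = E[payoff of action 0] - E[payoff of action 1] = alpha * v + beta
    beta = M_U[(0, 1)] - M_U[(1, 1)]
    alpha = (M_U[(0, 0)] - M_U[(1, 0)]) - beta

    def clip(lo, hi):
        lo, hi = max(lo, 0.0), min(hi, 1.0)
        return (lo, hi) if lo <= hi else None

    if alpha == 0:
        pos = (0.0, 1.0) if beta >= 0 else None    # where Delta >= 0
        neg = (0.0, 1.0) if beta <= 0 else None    # where Delta <= 0
        zero = (0.0, 1.0) if beta == 0 else None   # where Delta == 0
    else:
        r = -beta / alpha                          # root of Delta
        pos = clip(r, 1.0) if alpha > 0 else clip(0.0, r)
        neg = clip(0.0, r) if alpha > 0 else clip(r, 1.0)
        zero = clip(r, r)
    rects = []
    if pos is not None:
        rects.append((pos, (1.0, 1.0)))    # action 0 is optimal: u = 1
    if neg is not None:
        rects.append((neg, (0.0, 0.0)))    # action 1 is optimal: u = 0
    if zero is not None:
        rects.append((zero, (0.0, 1.0)))   # indifferent: any u in [0, 1]
    return rects

def in_region(rects, v, u):
    """Membership test for the union of rectangles."""
    return any(vl <= v <= vh and ul <= u <= uh for (vl, vh), (ul, uh) in rects)

# Hypothetical leaf that is paid 1 when it matches its child.
M_U = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
rects = leaf_rectangles(M_U)
print(rects)
print(in_region(rects, 0.75, 1.0), in_region(rects, 0.75, 0.5), in_region(rects, 0.5, 0.3))
```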

This simple (if exponential in n) geometric representation of the tables T(u, v)
permits the development of an alternative algorithm ExactTreeNash, which is simply 
the abstract algorithm TreeNash with the tables represented by unions of rectangles 
(and with associated implementations of the necessary upstream and downstream 
computations). 


Theorem 7.6 There is an algorithm ExactTreeNash that computes an exact
Nash equilibrium for any tree graphical game (G, M). Furthermore, the tables 
computed by the algorithm represent all Nash equilibria of (G, M). The algorithm 
runs in time exponential in the number of vertices of G. 


7.3.3 Extensions: NashProp and Beyond


At this point it is of course natural to ask what can be done when the underlying graph of 
a graphical game is not a tree. Remaining close to the development so far, it is possible 
to give a heuristic generalization of algorithm ApproximateTreeNash to the setting
in which the graph G is arbitrary. This algorithm is known as NashProp, which we will 
now briefly sketch. By heuristic we mean that the algorithm is well-defined and will 
terminate on any graphical game; but unlike ApproximateTreeNash, the running time 
is not guaranteed to be polynomial in the size of the input graphical game. (In general, 
we should expect provably efficient algorithms for equilibrium computation to require 
some topological restriction, since allowing G to be the complete graph reduces to the 
classical normal form representation.) 

Recall that the main computation at vertex V in ApproximateTreeNash was the 
computation of the downstream table T(w, v) from the upstream tables $T(v, u_i)$. This
assumed an underlying orientation to the tree that allowed V to know which of its
neighbors were in the direction of the leaves (identified as the $U_i$) and which single neighbor
was in the direction of the root (identified as W). The easiest way to describe NashProp 
informally is to say that each V does this computation once for each of its neighbors, 
each time “pretending” that this neighbor plays the role of the downstream neighbor W 
in ApproximateTreeNash, and the remaining neighbors play the roles of the upstream 
$U_i$. If all discretized table entries are initialized to the value of 1, it is easy to show that
the only possible effect of these local computations is to change table values from 1 
to 0, which in effect refutes conditional NE assertions when they violate best-response 
conditions. In other words, the tables will all converge (and in fact, in time polynomial 
in the size of the graphical game) — however, unlike in ApproximateTreeNash, the 
tables do not represent the set of all approximate NE, but a superset. This necessitates a 
second phase to the algorithm that employs more traditional search heuristics in order 
to find a true equilibrium, and it is this second phase that may be computationally 
expensive. 
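
The table-passing phase described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: binary actions, a mixed strategy represented by the probability of playing action 1, a small discretization `grid`, and hypothetical names (`nashprop_table_phase`, `eps_best_response`, `coordination`) introduced here for the example. Entries start at 1 and can only be switched to 0, so the loop terminates.

```python
from itertools import product

def expected_payoff(payoff, v, action, nbrs, mix):
    """Expected payoff to v for a pure action when neighbor u plays 1 w.p. mix[u]."""
    total = 0.0
    for pure in product((0, 1), repeat=len(nbrs)):
        prob = 1.0
        for u, a in zip(nbrs, pure):
            prob *= mix[u] if a == 1 else 1.0 - mix[u]
        total += prob * payoff[v](action, dict(zip(nbrs, pure)))
    return total

def eps_best_response(payoff, v, p_v, nbrs, mix, eps):
    """True iff playing 1 with probability p_v is within eps of v's best payoff."""
    e0 = expected_payoff(payoff, v, 0, nbrs, mix)
    e1 = expected_payoff(payoff, v, 1, nbrs, mix)
    return p_v * e1 + (1.0 - p_v) * e0 >= max(e0, e1) - eps

def nashprop_table_phase(neighbors, payoff, grid, eps):
    # T[(v, w)][(pw, pv)] == 1 means: "W plays pw, V plays pv" is not yet refuted.
    T = {(v, w): {(pw, pv): 1 for pw in grid for pv in grid}
         for v in neighbors for w in neighbors[v]}
    changed = True
    while changed:
        changed = False
        for (v, w), table in T.items():
            others = [u for u in neighbors[v] if u != w]  # play the upstream role
            for (pw, pv) in list(table):
                if table[(pw, pv)] == 0:
                    continue
                supported = False
                for pu in product(grid, repeat=len(others)):
                    # every upstream neighbor must still allow (pv, its own strategy)
                    if all(T[(u, v)][(pv, q)] == 1 for u, q in zip(others, pu)):
                        mix = dict(zip(others, pu))
                        mix[w] = pw
                        if eps_best_response(payoff, v, pv, neighbors[v], mix, eps):
                            supported = True
                            break
                if not supported:
                    table[(pw, pv)] = 0  # refute the conditional NE assertion
                    changed = True
    return T

# Hypothetical example: a 3-vertex path where each player is paid the number
# of neighbors that match its own action.
def coordination(action, nbr_actions):
    return sum(1 for a in nbr_actions.values() if a == action)

neighbors = {0: [1], 1: [0, 2], 2: [1]}
payoff = {v: coordination for v in neighbors}
tables = nashprop_table_phase(neighbors, payoff, grid=(0.0, 0.5, 1.0), eps=1e-9)
```

The surviving 1-entries are only candidates; as the text notes, a second search phase is still needed to assemble an actual equilibrium from them.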

One of the merits of NashProp is that the first (table computation) phase can be 
viewed as an instance of constraint satisfaction programming (CSP), which in turn 
plays an important role in many algorithms for probabilistic inference in Bayesian 
networks, Markov networks, and related models. The NashProp algorithm was also 
inspired by, and bears a fair similarity to, the well-known belief propagation algorithm 
for Bayesian network inference. We shall see other connections to these models arise 
in our examination of correlated equilibria in graphical games, which we turn to now. 


7.4 Graphical Games and Correlated Equilibria 


Our second case study is an examination of graphical games and correlated equilibrium. 
As has already been noted, if we are fortunate enough to be able to accurately represent 
a multiplayer game we are interested in as a graphical game with small degree, the 
representational benefits purely in terms of parameter reduction may be significant. 
But this is still a rather cosmetic kind of parsimony. We shall see a much deeper variety 
in the context of correlated equilibrium. 

The first issue that arises in this investigation is the problem of representing corre- 
lated equilibria. Recall that NE may be viewed as a special case of CE in which the 
distribution P(a) is a product distribution. Thus, however computationally difficult it 
may be to find an NE, at least the object itself can be succinctly represented — it is simply 
a mixed strategy profile p, whose length equals the number of players n. Despite their 
aforementioned advantages, in moving to CE we open a representational Pandora’s 
Box, since even in very simple graphical games there may be correlated equilibria of 
essentially arbitrary complexity. For example, the CE of a game always include all 
mixture distributions of NE, so any game with an exponential number of NE can yield 
extremely complex CE. Such games can be easily constructed with very simple graphs.
More generally, whereas by definition in an NE all players are independent, in a CE
there may be arbitrary high-order correlations.


¹ Note that in the description of TreeNash in Figure 7.1 it was more convenient to initialize the table values to 0,
but the change is cosmetic.

In order to maintain the succinctness of graphical games, some way of addressing 
this distributional complexity is required. For this we turn to another, older graph- 
theoretic formalism — namely, undirected graphical models for probabilistic inference, 
also known as Markov networks (Lauritzen, 1996). We will establish a natural and 
powerful relationship between a graphical game and a certain associated Markov 
network. Like the graphical game, the associated Markov network is a graph over the 
players. While the interactions between vertices in the graphical game are entirely 
strategic and given by local payoff matrices, the interactions in the associated Markov 
network are entirely probabilistic and given by local potential functions. The graph of 
the associated Markov network retains the parsimony of the graphical game. 

We will show that the associated Markov network is sufficient for representing
any correlated equilibrium of the graphical game (up to expected payoff equivalence).
In other words, the fact that a multiplayer game can be succinctly represented by a 
graph implies that its entire space of CE outcomes can be represented graphically 
with comparable succinctness. This result establishes a natural relationship between 
graphical games and modern probabilistic modeling. We will also briefly discuss the 
computational benefits of this relationship. 


7.4.1 Expected Payoff and Local Neighborhood Equivalence 


Our effort to succinctly model the CE of a graphical game consists of two steps.
In the first step, we argue that it is not necessary to model all the correlations that
might arise in a CE, but only those required to represent all of the possible (expected
payoff) outcomes for the players. In the second step, we show that the remaining
correlations can be represented by a Markov network. For these two steps we
respectively require two equivalence notions for distributions: expected payoff
equivalence and local neighborhood equivalence. We shall show that there is a natural
subclass of the set of all CE of a graphical game, based on expected payoff
equivalence, whose representation size is linearly related to the representation size of the
graphical game.


Definition 7.7 Two distributions P and Q over joint actions a are expected
payoff equivalent, denoted $P \equiv_{EP} Q$, if P and Q yield the same expected payoff
vector: for each i, $\mathbf{E}_{a \sim P}[M_i(a)] = \mathbf{E}_{a \sim Q}[M_i(a)]$.


Note that merely finding distributions giving the same payoffs as the CE is not espe- 
cially interesting unless those distributions are themselves CE — we want to preserve the 
strategic properties of the CE, not only its payoffs. Our primary tool for accomplishing 
this goal will be the notion of local neighborhood equivalence, or the preservation of 
local marginal distributions. Below we establish that local neighborhood equivalence 
both implies payoff equivalence and preserves the CE property. In the following sub- 
section, we describe how to represent this natural subclass in a certain Markov network 
whose structure is closely related to the structure of the graphical game.

Expected payoff equivalence of two distributions is, in general, dependent upon the
reward matrices of a graphical game. Let us consider the following (more stringent) 
equivalence notion, which is based only on the graph G of a game. 


Definition 7.8 For a graph G, two distributions P and Q over joint actions a
are local neighborhood equivalent with respect to G, denoted $P \equiv_{LN} Q$, if for all
players i, and for all settings $a^i$ of $N(i)$, $P(a^i) = Q(a^i)$.


In other words, the marginal distributions over all local neighborhoods defined by
G are identical. Since the graph is always clear from context, we shall just write
$P \equiv_{LN} Q$. The following lemma establishes that local neighborhood equivalence is
indeed a stronger notion of equivalence than expected payoff equivalence.


Lemma 7.9 For all graphs G, for all joint distributions P and Q on actions, and
for all graphical games with graph G, if $P \equiv_{LN} Q$ then $P \equiv_{EP} Q$. Furthermore,
for any graph G and joint distributions P and Q, there exist payoff matrices M
such that for the graphical game (G, M), if $P \not\equiv_{LN} Q$ then $P \not\equiv_{EP} Q$.


PROOF The first statement follows from the observation that the expected payoff
to player i depends only on the marginal distribution of actions in $N(i)$. To prove
the second statement, if $P \not\equiv_{LN} Q$, then there must exist a player i and a joint
action $a^i$ for its local neighborhood which has a different probability under P and
Q. Simply set $M_i(a^i) = 1$ and $M_i = 0$ elsewhere. Then i has a different payoff
under P and Q, and so $P \not\equiv_{EP} Q$.
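
The two equivalence notions can be checked by brute force on a small example. The sketch below is illustrative only; the function names, the 4-cycle graph, and the payoff tables are assumptions made for the example. It compares the uniform distribution on four binary players with the uniform distribution on even-parity joint actions: the two differ as joint distributions, yet every three-player neighborhood marginal is uniform under both.

```python
from itertools import product

def neighborhood_marginal(dist, nbhd):
    """Marginal of a joint distribution over the players listed in nbhd."""
    marg = {}
    for joint, prob in dist.items():
        key = tuple(joint[k] for k in nbhd)
        marg[key] = marg.get(key, 0.0) + prob
    return marg

def local_neighborhood_equivalent(P, Q, neighborhoods, tol=1e-12):
    """Definition 7.8: all neighborhood marginals agree."""
    for nbhd in neighborhoods.values():
        mp = neighborhood_marginal(P, nbhd)
        mq = neighborhood_marginal(Q, nbhd)
        for key in set(mp) | set(mq):
            if abs(mp.get(key, 0.0) - mq.get(key, 0.0)) > tol:
                return False
    return True

def expected_payoffs(dist, neighborhoods, M):
    """Expected payoff of each player; M[i] maps a setting of N(i) to a payoff."""
    return {i: sum(p * M[i][s]
                   for s, p in neighborhood_marginal(dist, nbhd).items())
            for i, nbhd in neighborhoods.items()}

# Hypothetical example on a 4-cycle: N(i) = (left neighbor, i, right neighbor).
neighborhoods = {0: (3, 0, 1), 1: (0, 1, 2), 2: (1, 2, 3), 3: (2, 3, 0)}
joints = list(product((0, 1), repeat=4))
Q_unif = {a: 1.0 / 16 for a in joints}
P_even = {a: 1.0 / 8 for a in joints if sum(a) % 2 == 0}

# Arbitrary illustrative payoff tables, one entry per neighborhood setting.
M = {i: {s: s[1] + 2 * s[0] * s[2] + i for s in product((0, 1), repeat=3)}
     for i in neighborhoods}

same_marginals = local_neighborhood_equivalent(P_even, Q_unif, neighborhoods)
pay_P = expected_payoffs(P_even, neighborhoods, M)
pay_Q = expected_payoffs(Q_unif, neighborhoods, M)
```

Here `same_marginals` is true and the two payoff vectors agree, as the first statement of Lemma 7.9 predicts, even though `P_even` puts zero mass on half of the joint actions.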


Thus local neighborhood equivalence implies payoff equivalence, but the converse
is not true in general (although there exist some payoff matrices for which the converse
holds). We now establish that local neighborhood equivalence also preserves CE.
It is important to note that this result does not hold for expected payoff equivalence.


Lemma 7.10 For any graphical game (G, M), if P is a CE for (G, M) and
$P \equiv_{LN} Q$ then Q is a CE for (G, M).


PROOF The lemma follows by noting that the CE expectation equations are
dependent only upon the marginal distributions of local neighborhoods, which
are preserved in Q.


While explicitly representing all CE is infeasible even in simple graphical games,
we next show that we can concisely represent, in a single model, all CE up to
local neighborhood (and therefore payoff) equivalence. The amount of space required
is comparable to that required to represent the graphical game itself, and allows
us to explore or enumerate the different outcomes achievable in the space of
CE.


7.4.2 Correlated Equilibria and Markov Nets


In the same way that graphical games provide a concise language for expressing local 
interaction in game theory, Markov networks exploit undirected graphs for expressing 
local interaction in probability distributions. It turns out that (a special case of) Markov 
networks are a natural and powerful language for expressing the CE of a graphical 
game, and that there is a close relationship between the graph of the game and its 
associated Markov network graph. We begin with the necessary definitions. 


Definition 7.11 A local Markov network is a pair $M = (G, \Psi)$, where

• G is an undirected graph on vertices $\{1, \ldots, n\}$;

• $\Psi$ is a set of potential functions, one for each local neighborhood $N(i)$, mapping
binary assignments of values of $N(i)$ to the range $[0, \infty)$:

\[ \Psi = \{\psi_i : i = 1, \ldots, n;\ \psi_i : \{a^i\} \to [0, \infty)\}, \]

where $\{a^i\}$ is the set of all $2^{|N(i)|}$ settings to $N(i)$.


A local Markov network M defines a probability distribution $P_M$ as follows. For
any binary assignment a to the vertices, we define

\[ P_M(a) = \frac{1}{Z} \prod_{i=1}^{n} \psi_i(a^i), \]

where $Z = \sum_{a} \prod_{i=1}^{n} \psi_i(a^i) > 0$ is the normalization factor.

Note that any joint distribution can be represented as a local Markov network on 
a sufficiently dense graph: if we let G be the complete graph then we simply have a 
single potential function over the entire joint action space a. However, if d is the size 
of the maximal neighborhood in G, then the representation size of a distribution in this
network is $O(n 2^d)$, a considerable savings over a tabular representation if $d \ll n$.
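
As a small illustration of the definition, the following sketch evaluates the distribution defined by a local Markov network by brute-force enumeration. It is meant only for tiny examples (the enumeration is exponential in n); the function name, the path graph, and the potential values are assumptions chosen for the example.

```python
from itertools import product

def local_markov_distribution(neighborhoods, potentials):
    """Return the joint distribution P_M(a) = (1/Z) * prod_i psi_i(a^i).

    neighborhoods[i] lists the players in N(i); potentials[i] maps each
    binary setting of N(i) (a tuple in the same order) to a value >= 0.
    """
    n = len(neighborhoods)
    weights = {}
    for a in product((0, 1), repeat=n):
        w = 1.0
        for i, nbhd in enumerate(neighborhoods):
            w *= potentials[i][tuple(a[k] for k in nbhd)]
        weights[a] = w
    Z = sum(weights.values())  # normalization factor
    if Z <= 0:
        raise ValueError("potentials must give positive total weight")
    return {a: w / Z for a, w in weights.items()}

# Hypothetical example: path 0 - 1 - 2, with N(0) = {0,1}, N(1) = {0,1,2}, N(2) = {1,2}.
neighborhoods = [(0, 1), (0, 1, 2), (1, 2)]
agree = {s: (2.0 if s[0] == s[1] else 1.0) for s in product((0, 1), repeat=2)}
flat = {s: 1.0 for s in product((0, 1), repeat=3)}
P_M = local_markov_distribution(neighborhoods, [agree, flat, agree])
```

With these potentials the unnormalized weights sum to Z = 18, so the all-zeros assignment has probability 4/18. The model stores 4 + 8 + 4 potential values, in line with the $O(n 2^d)$ bound.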

Local Markov networks are a special case of Markov networks, a well-studied 
probabilistic model in AI and statistics (Lauritzen, 1996; Pearl, 1988). A Markov 
network is typically defined with potential functions ranging over settings of maximal 
cliques in the graph, rather than local neighborhoods. Another approach we could have 
taken is to transform the graph G to a graph G’, which forms cliques of the local 
neighborhoods N(i), and then used standard Markov networks over G’ as opposed to 
local Markov networks over G. However, this can sometimes result in an unnecessary 
exponential blow-up of the size of the model when the resulting maximal cliques are 
much larger than the original neighborhoods. For our purposes, it is sufficient to define 
the potential functions over just local neighborhoods (as in our definition) rather than 
maximal cliques in G’, which avoids this potential blow-up. 

The following technical lemma establishes that a local Markov network always 
suffices to represent a distribution up to local neighborhood equivalence. 


Lemma 7.12 For all graphs G, and for all joint distributions P over joint
actions, there exists a distribution Q that is representable as a local Markov
network with graph G such that $Q \equiv_{LN} P$ with respect to G.


PROOF The objective is to find a single distribution Q that is consistent with
the players’ local neighborhood marginals under P and is also a Markov network
with graph G. For this we shall sketch the application of methods from
probabilistic inference and maximum entropy models to show that the maximum
entropy distribution $Q^*$, subject to $P \equiv_{LN} Q^*$, is a local Markov network. The
sketch below follows the classical treatment of this topic (Berger et al., 1996;
Lauritzen and Spiegelhalter, 1998; Dawid and Lauritzen, 1993) and is included
for completeness.

Formally, we wish to show that the solution to the following constrained
maximum entropy problem is representable in G:

\[ Q^* = \arg\max_{Q} H(Q) \equiv \arg\max_{Q} \sum_{a} Q(a) \log(1/Q(a)) \]

subject to
(i) $Q(a^i) = P(a^i)$, for all i, $a^i$.
(ii) Q is a proper probability distribution.


Note first that this problem always has a unique answer since H(Q) is strictly 
concave and all constraints are linear. In addition, the feasible set is nonempty, as 
it contains P itself. 

To get the form of $Q^*$, we solve the optimization problem by introducing
Lagrange multipliers $\lambda_{i,a^i}$ (for all i and $a^i$) for the neighborhood marginal
constraints (Condition (i) above); let us call $\lambda$ the resulting vector of multipliers.
We also introduce a single Lagrange multiplier $\beta$ for the normalization constraint
(Condition (ii) above). The optimization then becomes


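\[
\begin{aligned}
Q^* &= \arg\max_{Q, \lambda, \beta} \{ L(Q, \lambda, \beta) \} \\
    &= \arg\max_{Q, \lambda, \beta} \Big\{ H(Q) + \sum_{i \in [n]} \sum_{a^i} \lambda_{i,a^i} \big( Q(a^i) - P(a^i) \big) + \beta \Big( \sum_{a} Q(a) - 1 \Big) \Big\},
\end{aligned}
\]

where $Q(a)$ is constrained to be positive. Here, L is the Lagrangian function.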

A necessary condition for $Q^*$ is that $\partial L / \partial Q(a) \,|_{Q = Q^*} = 0$, for all a such that
$P(a) > 0$. After taking derivatives and some algebra, this condition implies, for
all a,


\[ Q^*_\lambda(a) = \frac{1}{Z_\lambda} \prod_{i=1}^{n} I[P(a^i) \neq 0] \exp(\lambda_{i,a^i}), \]

where $I[P(a^i) \neq 0]$ is an indicator function that evaluates to 1 iff $P(a^i) \neq 0$. We
use the subscript $\lambda$ on $Q^*_\lambda$ and $Z_\lambda$ to explicitly denote that they are parameterized
by the Lagrange multipliers.
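
For readers who want the omitted algebra, here is one way to fill in the step, assuming natural logarithms in $H(Q)$ and using the fact that $Q(b^i) = \sum_{a : a^i = b^i} Q(a)$, so that $\partial Q(b^i) / \partial Q(a)$ equals 1 when $b^i = a^i$ and 0 otherwise:

\[
\begin{aligned}
\frac{\partial L}{\partial Q(a)}
  &= -\log Q(a) - 1 + \sum_{i=1}^{n} \lambda_{i,a^i} + \beta = 0 \\
\Longrightarrow \quad Q(a)
  &= e^{\beta - 1} \prod_{i=1}^{n} \exp(\lambda_{i,a^i}).
\end{aligned}
\]

The constant $e^{\beta - 1}$ is fixed by normalization and plays the role of $1/Z_\lambda$. For joint actions a in which some neighborhood setting has $P(a^i) = 0$, the marginal constraint together with nonnegativity forces $Q(a) = 0$ directly, which is what the indicator factors record.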

It is important to note at this point that regardless of the value of the Lagrange
multipliers, each $\lambda_{i,a^i}$ is only a function of the $a^i$. Let the dual function
$F(\lambda) = L(Q^*_\lambda(a), \lambda, 0)$, and let $\lambda^*$ maximize this function. Note that those $\lambda_{i,a^i}$ that
correspond to $P(a^i) = 0$ are irrelevant parameters since $F(\lambda)$ is independent of
them. So for all i and $a^i$ such that $P(a^i) = 0$, we set $\lambda^*_{i,a^i} = 0$. For all i, $a^i$ we
define the functions $\psi^*_i(a^i) \equiv I[P(a^i) \neq 0] \exp(\lambda^*_{i,a^i})$. Hence, we can express
the maximum entropy distribution $Q^*$ as, for all a,

\[ Q^*_{\lambda^*}(a) = \frac{1}{Z_{\lambda^*}} \prod_{i=1}^{n} \psi^*_i(a^i), \]


which completes the proof. 
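
The proof is existential, but a matching distribution can also be computed numerically on small examples. The sketch below uses iterative proportional fitting (IPF), a standard technique that is swapped in here for illustration and is not part of the proof above: starting from the uniform distribution, it repeatedly rescales Q so that one neighborhood marginal at a time matches that of P. The function names and the 4-cycle example are assumptions, and the enumeration is exponential in n.

```python
from itertools import product

def marginal(dist, nbhd):
    """Marginal of a joint distribution over the players listed in nbhd."""
    out = {}
    for joint, prob in dist.items():
        key = tuple(joint[k] for k in nbhd)
        out[key] = out.get(key, 0.0) + prob
    return out

def ipf_match_marginals(P, neighborhoods, n, sweeps=500):
    """Approximate a distribution whose neighborhood marginals match those of P."""
    joints = list(product((0, 1), repeat=n))
    Q = {a: 1.0 / len(joints) for a in joints}  # start from uniform
    targets = [(nbhd, marginal(P, nbhd)) for nbhd in neighborhoods]
    for _ in range(sweeps):
        for nbhd, target in targets:
            current = marginal(Q, nbhd)
            for a in joints:
                key = tuple(a[k] for k in nbhd)
                if current[key] > 0.0:
                    # rescale so the marginal on this neighborhood equals the target
                    Q[a] *= target.get(key, 0.0) / current[key]
    return Q

# Hypothetical example: a 4-cycle with N(i) = (left neighbor, i, right neighbor)
# and an arbitrary strictly positive joint distribution P.
neighborhoods = [(3, 0, 1), (0, 1, 2), (1, 2, 3), (2, 3, 0)]
joints = list(product((0, 1), repeat=4))
total = sum(range(1, 17))
P = {a: (k + 1) / total for k, a in enumerate(joints)}
Q = ipf_match_marginals(P, neighborhoods, n=4)
```

Each rescaling multiplies Q by a function of a single neighborhood setting, so Q keeps the product-of-local-potentials form of a local Markov network throughout the iteration.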


The main result of this section follows, and shows that we can represent any correlated
equilibrium of a graphical game (G, M), up to payoff equivalence, with a local
Markov network $(G, \Psi)$. The proof follows from Lemmas 7.9, 7.10, and 7.12.


Theorem 7.13 For all graphical games (G, M), and for any correlated equilibrium
P of (G, M), there exists a distribution Q such that


(i) Q is also a correlated equilibrium for (G, M);
(ii) Q gives all players the same expected payoffs as P: $Q \equiv_{EP} P$; and
(iii) Q can be represented as a local Markov network with graph G.


Note that the representation size for any local Markov network with graph G is 
linear in the representation size of the graphical game, and thus we can represent the 
CE of the game parsimoniously. 


Remarks. Aside from simple parsimony, Theorem 7.13 allows us to import a rich 
set of tools from the probabilistic inference literature (Pearl, 1988; Lauritzen, 1996). 
For example, it is well known that for any vertices i and j and vertex set S in a (local) 
Markov network, i and j are conditionally independent given values for the variables 
in S if and only if S separates i and j — that is, the removal of S from G leaves i and j 
in disconnected components. This, together with Theorem 7.13, immediately implies 
a large number of conditional independences that must hold in any CE outcome. Also, 
as mentioned in the Introduction, Theorem 7.13 can be interpreted as strongly limiting 
the nature of the public randomization needed to implement any given CE outcome — 
namely, only “local” sources of random bits (as defined by G) are required. 
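
The separation condition quoted above is a plain graph-reachability test. A minimal sketch, assuming an adjacency-list representation and a hypothetical function name `separates`:

```python
from collections import deque

def separates(adj, S, i, j):
    """True iff removing the vertex set S from the graph disconnects i from j."""
    blocked = set(S)
    if i in blocked or j in blocked:
        raise ValueError("i and j must lie outside S")
    seen, queue = {i}, deque([i])
    while queue:  # breadth-first search that never enters S
        v = queue.popleft()
        if v == j:
            return False
        for u in adj[v]:
            if u not in blocked and u not in seen:
                seen.add(u)
                queue.append(u)
    return True

# Hypothetical examples: a path 0 - 1 - 2 - 3 and a 4-cycle.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

On the path, vertex 1 alone separates 0 from 3; on the cycle, both 1 and 3 must be removed to separate 0 from 2.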


7.4.3 Algorithms for Correlated Equilibria in Graphical Games 


Having established in Theorem 7.13 that a concise graphical game yields an equally 
concise representation of its CE up to payoff equivalence, we now turn our attention to 
algorithms for computing CE. In the spirit of our results thus far, we are interested in 
algorithms that can efficiently exploit the compactness of graphical games. 

It is well known that it is possible to compute CE via linear programming in time
polynomial in the size of the standard noncompact normal form. In this approach, one variable
is introduced for every possible joint action probability P(a), and the constraints
enforce both the CE condition and the fact that the variables must define a probability
distribution. It is not hard to verify that the constraints are all linear and there are
$O(2^n)$ variables and constraints in the binary action case. By introducing any linear
objective function, one can get an algorithm based on linear programming for
computing a single exact CE that runs in time polynomial in the size of the normal-form
representation of the game (i.e., polynomial in $2^n$).

For graphical games this solution is clearly unsatisfying, since it may require time 
exponential in the size of the graphical game. What is needed is a more concise way 
to express the CE and distributional constraints — ideally, linearly in the size of the 
graphical game representation. As we shall now sketch, this is indeed possible for tree 
graphical games. The basic idea is to express both the CE and distributional constraints 
entirely in terms of the local marginals, rather than the global probabilities of joint 
actions. 

For the case in which the game graph is a tree, it suffices to introduce linear distri- 
butional constraints over only the local marginals, along with consistency constraints 
on the intersections of local marginals. We thus define the following three categories 
of local constraints defining a linear program: 


Variables: For every player i and assignment $a^i$, there is a variable $P(a^i)$.
LP Constraints:


(i) CE Constraints: For all players i and actions $a, a'$,
\[ \sum_{a^i : a^i_i = a} P(a^i) M_i(a^i) \ge \sum_{a^i : a^i_i = a} P(a^i) M_i(a^i[i : a']). \]
(ii) Neighborhood Marginal Constraints: For all players i,
\[ \forall a^i : P(a^i) \ge 0; \qquad \sum_{a^i} P(a^i) = 1. \]
(iii) Intersection Consistency Constraints: For all players i and j, and for any assignment
$y^{ij}$ to the intersection neighborhood $N(i) \cap N(j)$,
\[ P_i(y^{ij}) \equiv \sum_{a^i : a^{ij} = y^{ij}} P(a^i) = \sum_{a^j : a^{ij} = y^{ij}} P(a^j) \equiv P_j(y^{ij}). \]


Note that if d is the size of the largest neighborhood, this system involves $O(n 2^d)$
variables and $O(n 2^d)$ linear inequalities, which is linear in the representation size of
the original graphical game, as desired. This leads to the following algorithmic result.


Theorem 7.14 For all tree graphical games (G, M), any solution to the linear
constraints given above is a correlated equilibrium for (G, M).


Thus, for instance, we may choose any linear objective function $F(\{P(a^i)\})$ and
apply standard efficient linear programming algorithms in order to find a CE maximizing
F in time polynomial in the size of the graphical game. One natural choice for F
is the social welfare, or the total expected payoff over all players:

\[ F(\{P(a^i)\}) = \sum_{i} \sum_{a^i} P(a^i) M_i(a^i). \]
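
To make the three constraint families concrete, the sketch below checks whether a given set of local marginals satisfies them and evaluates the social welfare objective. It is a feasibility checker rather than an LP solver, and everything in it is illustrative: the function names, the data layout (`P[i]` and `M[i]` as dictionaries keyed by settings of N(i)), and the two-player game of Chicken on a single edge are assumptions made for the example.

```python
from itertools import combinations, product

def restricted_marginal(P_i, nbhd, common):
    """Marginal of the neighborhood table P_i onto the players in `common`."""
    out = {}
    for s, p in P_i.items():
        key = tuple(s[nbhd.index(k)] for k in common)
        out[key] = out.get(key, 0.0) + p
    return out

def satisfies_local_constraints(P, M, nbhds, tol=1e-9):
    for i, nbhd in nbhds.items():
        pos = nbhd.index(i)
        # (ii) neighborhood marginal constraints
        if any(p < -tol for p in P[i].values()):
            return False
        if abs(sum(P[i].values()) - 1.0) > tol:
            return False
        # (i) CE constraints: no gain from deviating from action a to action b
        for a, b in product((0, 1), repeat=2):
            gain = 0.0
            for s, p in P[i].items():
                if s[pos] == a:
                    deviated = s[:pos] + (b,) + s[pos + 1:]
                    gain += p * (M[i][deviated] - M[i][s])
            if gain > tol:
                return False
    # (iii) intersection consistency constraints
    for i, j in combinations(nbhds, 2):
        common = [k for k in nbhds[i] if k in nbhds[j]]
        if common:
            mi = restricted_marginal(P[i], nbhds[i], common)
            mj = restricted_marginal(P[j], nbhds[j], common)
            if any(abs(mi.get(k, 0.0) - mj.get(k, 0.0)) > tol
                   for k in set(mi) | set(mj)):
                return False
    return True

def social_welfare(P, M):
    return sum(p * M[i][s] for i in P for s, p in P[i].items())

# Hypothetical example: Chicken on one edge; action 0 = dare, 1 = chicken.
nbhds = {0: (0, 1), 1: (0, 1)}
M = {0: {(0, 0): 0, (0, 1): 7, (1, 0): 2, (1, 1): 6},
     1: {(0, 0): 0, (0, 1): 2, (1, 0): 7, (1, 1): 6}}
third = {(0, 1): 1 / 3, (1, 0): 1 / 3, (1, 1): 1 / 3}
crash = {(0, 0): 1.0}
ce_ok = satisfies_local_constraints({0: third, 1: third}, M, nbhds)
ce_bad = satisfies_local_constraints({0: crash, 1: crash}, M, nbhds)
```

The uniform mixture over the three non-crash outcomes passes all constraints, while putting all mass on (dare, dare) violates a CE constraint because either player gains by switching to chicken. An actual solver would hand the same constraints, with the welfare objective, to a linear programming routine.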


7.5 Graphical Exchange Economies


In the same way that the graph of a graphical game represents restrictions on which 
pairs of players directly influence each other’s payoffs, it is natural to examine similar 
restrictions in classical exchange economies and other market models. In such models, 
there is typically some number k of goods available for trade, and n players or consumers 
in the economy. Each consumer has a utility function mapping bundles or amounts of 
the k goods to a subjective utility. (Settings in which the utility functions obey certain 
constraints, such as concavity or linearity, are often assumed.) Each consumer also 
has an endowment — a particular bundle of goods that they are free to trade. It is 
assumed that if prices $p \in (\mathbb{R}^+)^k$ are posted for the k goods, each consumer will
attempt to sell their initial endowment at the posted prices, and then attempt to buy
from other consumers that bundle of goods which maximizes their utility, subject to the
amount of cash received in the sale of their endowment. A celebrated result of Arrow
and Debreu (1954) established very general conditions under which an equilibrium
price vector exists: prices at which all consumers are able to sell all of their initial
endowments (no excess supply) and simultaneously able to purchase their utility-maximizing
bundles (no excess demand). The result depends crucially on the fact that
the model permits exchange of goods between any pair of consumers in the economy.

A natural graph- or network-based variant of such models again introduces an
undirected graph G over the n consumers, with the interpretation that trade is permitted
between consumers i and j if and only if the edge (i, j) is present in G.² The classical
equilibrium existence result can be recovered, but only if we now allow for the
possibility of local prices, that is, prices for each good-consumer pair. In other words,
at equilibrium in such a graphical economy, the price per unit of wheat may differ
when purchased from different consumers, due to the effects of network topology. In
this model, rationality means that consumers must always purchase goods from the
neighboring consumers offering the lowest prices.

As with graphical games, there are at least two compelling lines of research to 
pursue in such models. The first is computational: What graph topologies permit 
efficient algorithms for computing price equilibria? The second is structural: What 
can we say about how network structure influences properties of the price equilibria, 
such as the amount of price variation? Here we briefly summarize results in these two 
directions. 

On the computational side, as with the TreeNash algorithm for computing NE in
graphical games, it is possible to develop a provably correct and efficient algorithm for
computing approximate price equilibria in tree graphical economies with fairly general
utility functions. Like ApproximateTreeNash, this algorithm is a two-pass algorithm in
which information regarding conditional price equilibria is exchanged between neighboring
nodes, and a discrete approximation scheme is introduced. It is complementary
to other recent algorithms for computing price equilibria in the classical non-graphical
(fully connected) setting under linearity restrictions on the utility functions (discussed
in detail in Chapter 5).


² In the models considered to date, resale of purchased goods is not permitted; rather, we have “one-shot”
economies.

On the structural side, it can be shown that different stochastic models of network
formation can result in radically different price equilibrium properties. For example, 
consider the simplified setting in which the graph G is a bipartite graph between two 
types of parties, buyers and sellers. Buyers have an endowment of 1 unit of an abstract 
good called cash, but have utility only for wheat; sellers have an endowment of 1 unit 
of wheat but utility only for cash. Thus the only source of asymmetry in the economy 
is the structure of G. If G is a random bipartite graph (i.e., generated via a bipartite
generalization of the classical Erdős–Rényi model), then as n becomes large there will
be essentially no price variation at equilibrium (as measured, for instance, by the ratio
of the highest to lowest prices for wheat over the entire graph). Thus random graphs
behave “almost” like the fully connected case. In contrast, if G is generated according
to a stochastic process such as preferential attachment (Barabási and Albert, 1999), the
price variation at equilibrium is unbounded, growing as a root of the economy size n.


7.6 Open Problems and Future Research 


There are a number of intriguing open areas for further research in the broad topics 
discussed in this chapter, including the following. 


• Efficient Algorithms for Exact Nash Computation in Trees. Perhaps the most notable
technical problem left unresolved by the developments described here is that of efficiently
(i.e., in time polynomial in the graphical game description) computing exact Nash
equilibria for trees. This class falls between the positive results of Elkind et al. (2006)
for unions of paths and cycles, and the recent PPAD-completeness results for bounded
treewidth graphs (see Chapter 2).

• Strategy-Proof Algorithms for Distributed Nash Computation. The NashProp algorithm
described here and its variants are clearly not strategy-proof, in the sense that
players may have incentive to deviate from the algorithm if they are to actually realize
the Nash equilibrium they collectively compute. It would be interesting to explore the
possibilities for strategy-proof algorithms for graphical games.

• Cooperative, Behavioral, and Other Equilibrium Notions. Here we have described
algorithms and structural results for graphical games under noncooperative equilibrium
notions. It would be interesting to develop analogous theory for cooperative equilibria,
such as how the coalitions that might form depend on graph topology. The recent explosion
of work in behavioral game theory and economics (Camerer, 2003) is also ripe for
integration with graphical games (and many other aspects of algorithmic game theory as
well). For instance, one could investigate how the behavioral phenomenon of inequality
aversion might alter the relationship between network structure and equilibrium
outcomes.


7.7 Bibliographic Notes 


Graphical games were introduced by Kearns et al. (2001) (abbreviated KLS henceforth).
Related models were introduced at approximately the same time by Koller and
Milch (2003) and La Mura (2000). Graph-theoretic or network models of interaction
have a long history in economics and game theory, as surveyed by Jackson (2005);
these models tend to be less general than graphical games, and there is naturally less
explicit emphasis on computational issues.

The original KLS paper contained the tree-based algorithms examined in Section 7.3
and their analyses. The NashProp generalization of these algorithms
is due to Ortiz and Kearns (2003). A follow-up to the KLS paper by the
same authors (Littman et al., 2002) erroneously claimed an efficient algorithm for
computing an exact NE in tree graphical games (recall that the KLS paper gave an
efficient algorithm only for approximate NE in trees). The error was recently discovered
and discussed by Elkind et al. (2006), who proved that in fact no two-pass
algorithm can compute an exact equilibrium. The problem of efficiently computing
an exact equilibrium in time polynomial in the size of a tree graphical game remains
open.

The study of correlated equilibria in graphical games given in Section 7.4 is adapted
from Kakade et al. (2003). Roughgarden and Papadimitriou (2005) and Papadimitriou
(2005) gave more general algorithms for computing correlated equilibria in
graphical games and other compact representations. It is interesting to note that while
the Kakade et al. results show how all correlated equilibria (up to payoff equivalence)
can be succinctly represented as Markov networks, Papadimitriou’s algorithm (2005)
computes correlated equilibria that are mixtures of Nash equilibria and thus can be
efficiently sampled. Intractability results for certain correlated equilibrium computations
are given by Gilboa and Zemel (1989), as well as by Roughgarden and Papadimitriou
(2005).

Other papers providing algorithms for equilibrium computation in graphical games 
include those of Vickrey and Koller (2002), who examine hill-climbing algorithms 
for approximate NE, as well as constraint satisfaction generalizations of NashProp; 
and Daskalakis and Papadimitriou (2006), who show close connections between the 
computation of pure NE and probabilistic inference on the Markov network models 
discussed in the context of correlated equilibria in Section 7.4. 

Graphical games have also played a central role in striking recent developments
establishing the intractability of NE computations in general multiplayer games,
including the work by Daskalakis et al. (2006) and Goldberg and Papadimitriou (2006);
these developments are discussed in detail in Chapter 2. Daskalakis and Papadimitriou
also proved intractability results for computing NE in graphical games on highly
regular graphs (Daskalakis and Papadimitriou, 2005), while Schoenebeck and Vadhan
(2006) systematically characterize the complexity of a variety of equilibrium-related
computations, including NE verification and existence of pure equilibria.

The formulation of the graphical exchange economy model summarized in Sec- 
tion 7.5, as well as the price equilibrium proof and algorithms mentioned, is due 
to Kakade et al. (2004). The result on price variation in different stochastic graph 
generation models is due to Kakade et al. (2005). 

Recently a graph-theoretic generalization of classical evolutionary game theory 
has been introduced, and it has been shown that random graphs generally preserve 
the classical evolutionary stable strategies (Kearns and Suri, 2006); these results are 
discussed in some detail in Chapter 29.


CHAPTER 8


Cryptography and Game Theory 


Yevgeniy Dodis and Tal Rabin 


Abstract 


The Cryptographic and Game Theory worlds seem to have an intersection in that they both deal with 
an interaction between mutually distrustful parties which has some end result. In the cryptographic 
setting the multiparty interaction takes the shape of a set of parties communicating for the purpose 
of evaluating a function on their inputs, where each party receives at the end some output of the 
computation. In the game theoretic setting, parties interact in a game that guarantees some payoff for 
the participants according to the joint actions of all the parties, while the parties wish to maximize 
their own payoff. In the past few years the relationship between these two areas has been investigated 
with the hope of having cross fertilization and synergy. In this chapter we describe the two areas, the 
similarities and differences, and some of the new results stemming from their interaction. 

The first and second sections describe the cryptographic and the game theory settings
(respectively). In the third section we contrast the two settings, and in the last sections we detail some of the
existing results.


8.1 Cryptographic Notions and Settings 


Cryptography is a vast subject requiring its own book. Therefore, in the following 
we will give only a high-level overview of the problem of Multi-Party Computation 
(MPC), ignoring most of the lower-level details and concentrating only on aspects 
relevant to Game Theory. 

MPC deals with the following problem. There are $n \ge 2$ parties $P_1, \ldots, P_n$, where
party $P_i$ holds input $t_i$, $1 \le i \le n$, and they wish to compute together a function
$s = f(t_1, \ldots, t_n)$ on their inputs. The goal is that each party will learn the output of
the function, s, yet with the restriction that $P_i$ will not learn any additional information
about the input of the other parties aside from what can be deduced from the pair
$(t_i, s)$. Clearly it is the secrecy restriction that adds complexity to the problem, as
without it each party could announce its input to all other parties, and each party would
locally compute the value of the function. Thus, the goal of MPC is to achieve the
following two properties at the same time: correctness of the computation and privacy
preservation of the inputs.


Two generalizations. The following two generalizations of the above scenario are often 
useful. 


(i) Probabilistic functions. Here the value of the function depends on some random string 
r chosen according to some distribution: s = f(t_1, ..., t_n; r). An example of this is 
the coin-flipping functionality, which takes no inputs, and outputs an unbiased random 
bit. Notice that it is crucial that the value r is not controlled by any of the parties, but is 
somehow jointly generated during the computation. 

(ii) Multioutput functions. It is not mandatory that there be a single output of the function. 
More generally there could be a unique output for each party, i.e., (s_1, ..., s_n) = 
f(t_1, ..., t_n). In this case, only party P_i learns the output s_i, and no party learns 
any information about the other parties’ inputs and outputs aside from what can be 
derived from its own input and output. 


The parties. One of the most interesting aspects of MPC is to reach the objective of 
computing the function value, but under the assumption that some of the parties may 
deviate from the protocol. In cryptography, the parties are usually divided into two 
types: honest and faulty. An honest party follows the protocol without any deviation. 
Otherwise, the party is considered to be faulty. Faulty behavior can manifest itself 
in a wide range of ways. The most benign faulty behavior is where the parties 
follow the protocol, yet try to learn as much as possible about the inputs of the other 
parties. These parties are called honest-but-curious (or semihonest). At the other end 
of the spectrum, the parties may deviate from the prescribed protocol in any way that 
they desire, with the goal of either influencing the computed output value in some way, 
or of learning as much as possible about the inputs of the other parties. These parties 
are called malicious. 

We envision an adversary A, who controls all the faulty parties and can coordinate 
their actions. Thus, in a sense we assume that the faulty parties are working together and 
can exert the most knowledge and influence over the computation out of this collusion. 
The adversary can corrupt any number of parties out of the n participating parties. Yet, 
in order to be able to achieve a solution to the problem, in many cases we would need 
to limit the number of corrupted parties. We call this limit a threshold k, indicating that 
the protocol remains secure as long as the number of corrupted parties is at most k. 


8.1.1 Security of Multiparty Computations 


We are ready to formulate the idea of what it means to securely compute a given 
function f. Assume that there exists a trusted party who privately receives the inputs 
of all the participating parties, calculates the output value s, and then transmits this 
value to each one of the parties.¹ This process clearly computes the correct output of 
f, and also does not enable the participating parties to learn any additional information 


1 Note that in the case of a probabilistic function the trusted party will choose r according to the specified 
distribution and use it in the computation. Similarly, for multioutput functions the trusted party will only give 
each party its own output. 
about the inputs of others. We call this model the ideal model. The security of MPC 
then states that a protocol is secure if its execution satisfies the following: (1) the 
honest parties compute the same (correct) outputs as they would in the ideal model; 
and (2) the protocol does not expose more information than a comparable execution 
with the trusted party, in the ideal model. 

Intuitively, this is explained in the following way. The adversary’s interaction with 
the parties (on a vector of inputs) in the protocol generates a transcript. This transcript 
is a random variable that includes the outputs of all the honest parties, which is needed 
to ensure correctness as explained below, and the output of the adversary A. The 
latter output, without loss of generality, includes all the information that the adversary 
learned, including its inputs, private state, all the messages sent by the honest parties 
to A, and, depending on the model (see later discussion on the communication model), 
maybe even more information, such as public messages that the honest parties 
exchanged. If we show that exactly the same transcript distribution² can be generated 
when interacting with the trusted party in the ideal model, then we are guaranteed that 
no information is leaked from the computation via the execution of the protocol, as we 
know that the ideal process does not expose any information about the inputs. More 
formally, 


Definition 8.1 Let f be a function on n inputs and let π be a protocol that 
computes the function f. Given an adversary A, which controls some set of 
parties, we define REAL_{A,π}(t) to be the sequence of outputs of honest parties 
resulting from the execution of π on input vector t under the attack of A, in 
addition to the output of A. Similarly, given an adversary A’ which controls a set 
of parties, we define IDEAL_{A’,f}(t) to be the sequence of outputs of honest parties 
computed by the trusted party in the ideal model on input vector t, in addition 
to the output of A’. We say that π securely computes f if, for every adversary 
A as above, there exists an adversary A’, which controls the same parties in the 
ideal model, such that, on any input vector t, we have that the distribution of 
REAL_{A,π}(t) is “indistinguishable” from the distribution of IDEAL_{A’,f}(t) (where 
the term “indistinguishable” will be explained later). 
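
In symbols, the condition of Definition 8.1 can be summarized compactly as follows. This is only a restatement of the definition; the symbol ≈ is our shorthand here (an assumption of this sketch, not notation from the text) for whichever notion of indistinguishability is eventually fixed, and A’ is required to control the same set of parties as A.

```latex
\[
  \pi \text{ securely computes } f
  \quad\Longleftrightarrow\quad
  \forall A \;\; \exists A' \;\; \forall t :\;\;
  \mathrm{REAL}_{A,\pi}(t) \;\approx\; \mathrm{IDEAL}_{A',f}(t)
\]
```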


Intuitively, the task of the ideal adversary A’ is to generate (almost) the same output 
as A generates in the real execution (referred to also as the real model). Thus, the 
attacker A’ is often called the simulator (of A). Also note that the above definition 
guarantees correctness of the protocol. Indeed, the transcript value generated in the ideal 
model, IDEAL_{A’,f}(t), also includes the outputs of the honest parties (even though we 
do not give these outputs to A’), which we know were correctly computed by the trusted 
party. Thus, the real transcript REAL_{A,π}(t) should also include correct outputs of the 
honest parties in the real model. 


The inputs of the faulty parties. We assumed that every party P_i has an input t_i, which 
it enters into the computation. However, if P_i is faulty, nothing stops P_i from changing 
t_i into some t_i’. Thus, the notion of a “correct” input is defined only for honest parties. 


2 The requirement that the transcript distribution be exactly the same will be relaxed later on. 
However, the “effective” input of a faulty party P_i could be defined as the value t_i’ that 
the simulator A’ (which we assume exists for any real model A) gives to the trusted 
party in the ideal model. Indeed, since the outputs of honest parties look the same in 
both models, for all effective purposes P_i must have “contributed” the same input t_i’ in 
the real model. 

Another possible misbehavior of P_i, even in the ideal model, might be a refusal to 
give any input at all to the trusted party. This can be handled in a variety of ways, 
ranging from aborting the entire computation to simply assigning t_i some “default 
value.” For concreteness, we assume that the domain of f includes a special symbol ⊥ 
indicating this refusal to give the input, so that it is well defined how f should be 
computed on such missing inputs. What this requires is that in any real protocol we 
detect when a party does not enter its input and deal with it exactly in the same manner 
as if the party had input ⊥ in the ideal model. 
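
To make the ideal model concrete, here is a minimal Python sketch of a trusted party, including the treatment of a missing input. Everything named here (trusted_party, BOTTOM, majority_with_default, coin_flip) is hypothetical and chosen only for illustration; the choice of 0 as the default value for a missing input is likewise an assumption.

```python
import random

BOTTOM = None  # hypothetical stand-in for the special "no input" symbol


def trusted_party(f, inputs, rng=None):
    """Ideal-model run: privately collect all inputs, evaluate f, and
    return a list whose i-th entry is the only value party i receives."""
    rng = rng or random.Random()
    r = rng.getrandbits(1)  # randomness is picked by the trusted party alone
    return list(f(inputs, r))


def majority_with_default(inputs, r):
    """Deterministic example: majority of the input bits, where a missing
    input (BOTTOM) is replaced by the default value 0. r is ignored."""
    votes = [0 if t is BOTTOM else t for t in inputs]
    s = 1 if 2 * sum(votes) > len(votes) else 0
    return [s] * len(inputs)


def coin_flip(inputs, r):
    """Probabilistic example: ignore the inputs and hand everyone the bit r."""
    return [r] * len(inputs)
```

For instance, trusted_party(majority_with_default, [1, BOTTOM, 1]) hands the bit 1 to every party. A real protocol is secure, in the sense of Definition 8.1, when whatever an adversary obtains from it could also have been produced by interacting with such a trusted party.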


Variations on output delivery. In the above definition of security it is implicitly assumed 
that all honest parties receive the output of the computation. This is achieved by stating 
that IDEAL_{A’,f}(t) includes the outputs of all honest parties. We therefore say that our 
current definition guarantees output delivery. 

A more relaxed property than output delivery is fairness. If fairness is achieved, then 
this means that if at least one (even faulty!) party learns its outputs, then all (honest) 
parties eventually do too. A bit more formally, we allow the ideal model adversary 
A’ to instruct the trusted party not to compute any of the outputs. In this case, in the 
ideal model either all the parties learn the output, or none do. Since the transcript of A is 
indistinguishable from that of A’, the same fairness guarantee must hold 
in the real model as well. 

Yet, a further relaxation of the definition of security is to provide only correctness 
and privacy. This means that faulty parties can learn their outputs, and prevent 
the honest parties from learning theirs. Yet, at the same time the protocol will 
still guarantee that (1) if an honest party receives an output, then this is the correct 
value, and (2) the privacy of the inputs and outputs of the honest parties is 
preserved. 


Variations on the model. The basic security notions introduced above are universal and 
model-independent. However, specific implementations crucially depend on spelling 
out precisely the model where the computation will be carried out. In particular, the 
following issues must be specified: 


(i) The parties. As mentioned above, the faulty parties could be honest-but-curious or 
malicious, and there is usually an upper bound k on the number of parties that the 
adversary can corrupt. 

(ii) Computational assumptions. We distinguish between the computational setting and 
the information theoretic setting. In the information theoretic model we assume that 
the adversary is unlimited in its computing powers. In this case the term 
“indistinguishable” in Definition 8.1 is formalized by requiring the two transcript distributions 
to be either identical (so-called perfect security) or, at least, statistically close in their 
variation distance (so-called statistical security). On the other hand, in the computational 
setting we restrict the power of the adversary (as well as that of the honest 
parties). A bit more precisely, we assume that the corresponding MPC problem is 
parameterized by the security parameter λ, in which case (a) all the computation 
and communication shall be done in time polynomial in λ; and (b) the misbehavior 
strategies of the faulty parties are also restricted to be run in time polynomial in λ. 
Furthermore, the term “indistinguishability” in Definition 8.1 is formalized by 
computational indistinguishability: two distribution ensembles {X_λ}_λ and {Y_λ}_λ are said to 
be computationally indistinguishable, if for any polynomial-time distinguisher D, the 
quantity ε, defined as |Pr[D(X_λ) = 1] - Pr[D(Y_λ) = 1]|, is a “negligible” function 
of λ. This means that for any j > 0 and all sufficiently large λ, ε becomes 
smaller than λ^{-j}. 

This modeling of computationally bounded parties enables us to build secure MPC 
protocols depending on plausible computational assumptions, such as the hardness of 
factoring large integers, etc. 

(iii) Communication assumptions. The two common communication assumptions are the 
existence of a secure channel and the existence of a broadcast channel. Secure channels 
assume that every pair of parties P_i and P_j are connected via an authenticated, 
private channel. A broadcast channel is a channel with the following properties: 
if a party P_i (honest or faulty) broadcasts a message m, then m is correctly received 
by all the parties (who are also sure the message came from P_i). In particular, 
if an honest party receives m, then it knows that every other honest party also 
received m. 

A different communication assumption is the existence of envelopes. An envelope 
(in its most general definition) guarantees the following properties: a value m can 
be stored inside the envelope, it will be held without exposure for a given period of 
time, and then the value m will be revealed without modification. A ballot box is an 
enhancement of the envelope setting that also provides a random shuffling mechanism 
of the envelopes. 

These are, of course, idealized assumptions that allow for a clean description of 
a protocol, as they separate the communication issues from the computational ones. 
These idealized assumptions may be realized by physical mechanisms, but in some 
settings such mechanisms may not be available. Then it is important to address the 
question of whether, and under what circumstances, we can remove a given communication 
assumption. For example, we know that the assumption of a secure channel can be 
substituted with a protocol, but under the introduction of a computational assumption 
and a public key infrastructure. In general, the details of these substitutions are delicate 
and need to be done with care. 
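
Returning to the information theoretic setting of item (ii) above: for distributions over a small finite set, the variation distance can be computed directly. The Python sketch below does so, and also checks by brute force the standard fact that the best advantage |Pr[D(X) = 1] - Pr[D(Y) = 1]| of any (unbounded, deterministic) distinguisher D equals this distance. The function names and the two example distributions are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import chain, combinations


def variation_distance(p, q):
    """Half the L1 distance between two finite distributions given as dicts."""
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0) - q.get(x, 0)) for x in support) / 2


def best_advantage(p, q):
    """Brute force over all deterministic distinguishers: D is identified
    with the set of outcomes on which it outputs 1."""
    support = sorted(set(p) | set(q))
    accept_sets = chain.from_iterable(
        combinations(support, size) for size in range(len(support) + 1)
    )
    return max(
        abs(sum(p.get(x, 0) - q.get(x, 0) for x in accept))
        for accept in accept_sets
    )


# Two example transcript distributions (hypothetical).
X = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
Y = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
```

Perfect security corresponds to distance 0, and statistical security to a distance that is negligible in the relevant parameter. The brute-force search is exponential in the support size and is meant only as a check on toy examples.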


8.1.2 Existing Results for Multiparty Computation 


Since the introduction of the MPC problem in the beginning of the 1980s, the work in 
this area has been extensive. We will only state, without proofs, a few representative 
results from the huge literature in this area. 


Theorem 8.2. Secure MPC protocols withstanding coalitions of up to k malicious 
parties (controlled by an attacker A) exist in the following cases: 

(i) Assuming that A is computationally bounded, that secure channels and a broadcast 
channel are available, and that a certain cryptographic assumption (implied, for 
example, by the hardness of factoring) holds: 

(a) for k < n/2 with output delivery. 

(b) for k < n with correctness and privacy. 

(c) additionally assuming envelopes, for k < n with fairness. 

(ii) Assuming that A is computationally unbounded: 

(a) assuming secure channels, then for k < n/3 with output delivery. 

(b) assuming secure and broadcast channels, then for k < n/2 with output delivery 
(but with an arbitrarily small probability of error). 

(c) assuming envelopes, ballot-box and a broadcast channel, then for k < n with 
output delivery. 


Structure of MPC protocols. A common design structure of many MPC protocols 
proceeds in three stages: commitment to the inputs, computation of the function on the 
committed inputs, revealing of the output. Below we describe these stages at a high 
level, assuming for simplicity that the faulty parties are honest-but-curious. 

In the first stage the parties commit to their inputs. This is done by utilizing the 
first phase of a two-phased primitive called secret-sharing. The first phase of a 
(k, n)-secret-sharing scheme is the sharing phase. A dealer, D, who holds some secret z, 
computes n shares z_1, ..., z_n of z and gives the share z_i to party P_i. The second 
phase is the reconstruction phase, which we describe here and utilize later. For the 
reconstruction, the parties broadcast their shares to recover z. Informally, such 
secret-sharing schemes satisfy the following two properties: (1) k, or fewer, shares do not 
reveal any information about z; but (2) any k + 1 or more shares enable one to recover 
z. Thus, up to k colluding parties learn no information about z after the sharing stage, 
while the presence of at least k + 1 honest parties allows one to recover the secret in 
the reconstruction phase (assuming, for now, that no incorrect shares are given). 

The classical secret-sharing scheme satisfying these properties is the Shamir 
secret-sharing scheme. Here we assume that the value z lies in some finite field F of cardinality 
greater than n (such as the field of integers modulo a prime p > n). The dealer D 
chooses a random polynomial g of degree k with the only constraint that the free 
coefficient of g is z. Thus, z = g(0). Then, if α_1, ..., α_n are arbitrary but agreed in 
advance distinct nonzero elements of F, the share of party P_i is computed as z_i = g(α_i). It is 
now easy to observe that any k + 1 shares z_i are enough to interpolate the polynomial 
g and compute g(0) = z. Furthermore, any set of k shares is independent of z. This 
is easy to see as for any value z’ ∈ F there exists a (k + 1)st share such that with the 
given set of k shares they interpolate a polynomial g’, where g’(0) = z’, in a sense 
making any value of the secret equally likely. Thus, properties (1) and (2) stated above 
are satisfied. 
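
The sharing and reconstruction phases of the Shamir scheme can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a hardened implementation: the field is the integers modulo the prime P = 2^31 - 1, the evaluation points are α_i = i, the function names share and reconstruct are ours, and the random module stands in for a cryptographically secure source of randomness (a real implementation would use something like the secrets module). The modular inverse via pow(den, -1, P) needs Python 3.8 or later.

```python
import random

P = 2_147_483_647  # the prime 2**31 - 1; F is the integers mod P (assumed choice)


def share(secret, k, n, rng=random):
    """Sharing phase: pick a random polynomial g of degree at most k with
    g(0) = secret and return {alpha_i: g(alpha_i)} for alpha_i = 1..n."""
    coeffs = [secret % P] + [rng.randrange(P) for _ in range(k)]

    def g(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc

    return {i: g(i) for i in range(1, n + 1)}


def reconstruct(shares):
    """Reconstruction phase: Lagrange interpolation at 0 from a dict
    {alpha_i: z_i} containing at least k + 1 correct shares."""
    secret = 0
    for i, z_i in shares.items():
        num, den = 1, 1
        for j in shares:
            if j != i:
                num = num * (-j) % P
                den = den * (i - j) % P
        secret = (secret + z_i * num * pow(den, -1, P)) % P
    return secret
```

For example, with shares = share(1234, k=2, n=5), calling reconstruct on any three of the five entries, or on all of them, returns 1234, whereas two entries alone are consistent with every possible secret.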

To summarize, the first stage of the MPC is achieved by having each party P_i invoke 
the first part of the secret-sharing process as the dealer D with its input t_i as the secret, 
and distribute the correct shares of t_i to each party P_j. If f is probabilistic, the players 
additionally run a special protocol at the end of which a (k, n)-secret-sharing of a 
random and secret value r is computed. 
In the second stage the parties compute the function f. This is done by evaluating 
the pre-agreed-upon arithmetic circuit representing f over F, which is composed of 
addition, scalar-multiplication and multiplication gates. The computation proceeds by 
evaluating the gates one by one. We inductively assume that the inputs to the gates are 
shared in the manner described above in the secret-sharing scheme, and we guarantee 
that the output of the gate will preserve the same representation. This step forms the 
heart of most MPC protocols. The computation of the addition and scalar-multiplication 
gates is typically straightforward and does not require communication (e.g., 
for the Shamir secret-sharing scheme the parties simply add their input shares locally, 
or multiply them by the scalar), but the computation of the multiplication 
gate is considerably more involved and requires communication. For our purposes we will 
not need the details of the computation mechanism; simply assuming that this computation 
on shares is possible will suffice. Therefore, we can assume that at the end of the second 
stage the parties have a valid secret-sharing of the required output(s) of the function f. 
The most crucial observation is that no additional information is leaked throughout this 
stage, since all the values are always shared through a (k, n)-secret-sharing scheme. 

Finally, in the last stage the parties need to compute their individual outputs of the 
function. As we have inductively maintained the property that the output of each gate 
is in the secret-sharing representation, the same is true for the output gate of f. 
Thus, to let the parties learn the output s, which is the value of the function, the parties 
simply run the reconstruction phase of the secret-sharing scheme (as described above), 
by having each party broadcast its share of s. 
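
The three stages can be traced end to end for a function that uses only addition and scalar-multiplication gates, where no interaction is needed in the second stage. The Python sketch below computes a publicly weighted sum of the inputs with honest-but-curious parties. It is a toy simulation in a single process under the same assumptions as any Shamir-based sketch (integers modulo the prime 2^31 - 1, evaluation points α_j = j, non-cryptographic randomness); the function weighted_sum_mpc and its helpers are hypothetical names, and multiplication gates are deliberately not covered.

```python
import random

P = 2_147_483_647  # the prime 2**31 - 1 (assumed field)


def share(secret, k, n, rng=random):
    """Return [g(1), ..., g(n)] for a random g of degree <= k with g(0) = secret."""
    coeffs = [secret % P] + [rng.randrange(P) for _ in range(k)]
    return [sum(c * pow(x, e, P) for e, c in enumerate(coeffs)) % P
            for x in range(1, n + 1)]


def reconstruct(points):
    """Lagrange interpolation at 0 from a list of (alpha, share) pairs."""
    total = 0
    for a_i, z_i in points:
        num = den = 1
        for a_j, _ in points:
            if a_j != a_i:
                num = num * (-a_j) % P
                den = den * (a_i - a_j) % P
        total = (total + z_i * num * pow(den, -1, P)) % P
    return total


def weighted_sum_mpc(inputs, weights, k):
    n = len(inputs)
    # Stage 1: each P_i acts as dealer for t_i; dealt[i][j] goes to party P_{j+1}.
    dealt = [share(t, k, n) for t in inputs]
    # Stage 2: each party locally applies the scalar-multiplication and
    # addition gates to the shares it holds (no communication).
    out_shares = [sum(w * dealt[i][j] for i, w in enumerate(weights)) % P
                  for j in range(n)]
    # Stage 3: all parties broadcast their output shares and reconstruct s.
    return reconstruct([(j + 1, z) for j, z in enumerate(out_shares)])
```

The second stage works because a weighted sum of polynomials of degree at most k is again a polynomial of degree at most k, whose free coefficient is the weighted sum of the secrets.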


8.2 Game Theory Notions and Settings 


Strategic games. We assume that the reader is familiar with the basic concepts 
of strategic (or “one-shot simultaneous move”) games, including the notions of 
Nash Equilibrium (NE) and Correlated Equilibrium (CE). In particular, recall from 
Chapter 1 that the class of NE corresponds to independent strategies of all the parties, 
while the class of CE corresponds to arbitrary correlated strategies. However, in order to 
implement a given CE one generally needs a special “correlation device,” a so-called 
mediator M, which will sample the prescribed strategy profile s = (s_1, ..., s_n) for all the 
parties, and disclose privately only action s_i to each player P_i. In particular, it is very 
important that P_i does not learn anything about the recommended actions of the other 
parties, beyond what could be implied by its own action s_i. Finally, recall that one can 
achieve considerably higher payoffs by playing a well-selected CE than what is possible 
using any given NE, or even what can be achieved by taking any convex combination of NE payoffs. 
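
The last claim can be checked on a small example. The Python sketch below uses a two-player game of Chicken with payoffs that we choose for illustration (they are an assumption of this sketch): (C, C) gives (4, 4), (C, D) gives (1, 5), (D, C) gives (5, 1), and (D, D) gives (0, 0). It tests the correlated-equilibrium constraints for a mediator that recommends (C, C), (C, D), and (D, C) with probability 1/3 each. All function and variable names are hypothetical.

```python
from fractions import Fraction as F

ACTIONS = ("C", "D")
# U[(a1, a2)] = (payoff to P_1, payoff to P_2); illustrative Chicken payoffs.
U = {("C", "C"): (4, 4), ("C", "D"): (1, 5),
     ("D", "C"): (5, 1), ("D", "D"): (0, 0)}


def is_correlated_equilibrium(dist):
    """dist maps action profiles to probabilities. Check that no player can
    gain by replacing a recommended action rec with some other action dev."""
    for player in (0, 1):
        for rec in ACTIONS:
            for dev in ACTIONS:
                gain = F(0)
                for profile, prob in dist.items():
                    if profile[player] != rec:
                        continue
                    deviated = list(profile)
                    deviated[player] = dev
                    gain += prob * (U[tuple(deviated)][player] - U[profile][player])
                if gain > 0:
                    return False
    return True


def expected_payoffs(dist):
    return tuple(sum(prob * U[profile][i] for profile, prob in dist.items())
                 for i in (0, 1))


mediator = {("C", "C"): F(1, 3), ("C", "D"): F(1, 3), ("D", "C"): F(1, 3)}
mixed_ne = {(a, b): F(1, 4) for a in ACTIONS for b in ACTIONS}  # each plays C w.p. 1/2
```

Under these payoffs the mediated distribution is a CE with expected payoff 10/3 for each player, so the payoffs sum to 20/3. The three Nash equilibria of this game, namely (C, D), (D, C), and the mixed equilibrium in which each player plays C with probability 1/2, have payoff sums 6, 6, and 5, so no convex combination of NE payoffs reaches a sum above 6.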


Games with incomplete information. In games with incomplete information, each party 
has a private type t_i ∈ T_i, where the joint vector t = (t_1, ..., t_n) is assumed to be drawn 
from some publicly known distribution. The point of such a type, t_i, is that it affects 
the utility function of party P_i: namely, the utility u_i depends not only on the actions 
s_1, ..., s_n, but also on the private type t_i of party P_i, or, in even more general games, 
on the entire type vector t of all the parties. With this in mind, generalizing the notion 
of Nash equilibrium to such games is straightforward. (The resulting Nash equilibrium 
is also called Bayesian.) 
Mediated games generalize to the typed setting, in which parties have to send their 
types to the mediator M before receiving the joint recommendation. Depending on 
the received type vector t, the mediator samples a correlated strategy profile s and 
gives each party its recommended action s_i, as before. We remark that the expected 
canonical strategy of party P_i is to honestly report its type t_i to M, and then follow the 
recommended action s_i. However, P_i can deviate from the protocol in two ways: (1) 
send a wrong type t_i’ or not send a type at all to M, as well as (2) decide to change 
the recommended action from s_i to some s_i’. As a mediator may receive faulty types, a 
fully defined sampling strategy for the mediator should specify the joint distribution x^t 
for every type vector t = (t_1, ..., t_n), even outside the support of the joint type distribution. 
Formally, x^t should be defined for every t ∈ ∏_i (T_i ∪ {⊥}), where ⊥ is a special 
symbol indicating an invalid type. (In particular, games of complete information can 
be seen as a special case where all t_i = ⊥ and each party “refused” to report its type.) 
With this in mind, the generalization of CE to games with incomplete information is 
straightforward. 


Aborting the game. We assume that the parties will always play the game by choosing an 
action s_i ∈ S_i and getting an appropriate payoff u_i(s). Of course, we can always model 
refusal to play by introducing a special action into the strategy space, and defining 
the explicit utilities corresponding to such actions. Indeed, many games effectively 
guarantee participation by assigning very low payoff to actions equivalent to aborting 
the computation. However, this is not a requirement; in fact, many games do not even 
have the abort action as part of their action spaces. To summarize, aborting is not 
something which is inherent to games, although it could be modeled within the game, 
if required. 


Extended games. So far we considered only strategic games, where parties move 
in “one-shot” (possibly with the help of the mediator). Of course, these games are 
special cases of much more general extensive form games (with complete or incomplete 
information), where a party can move in many rounds and whose payoffs depend on 
the entire run of the game. In our setting we will be interested only in a special class of 
such extensive form games, which we call (strategic) games extended by cheap-talk, 
or, in brief, extended games. 

An extended game G* is always induced by a basic strategic game G (of either 
complete or incomplete information), and has the following form. In the cheap-talk 
(or preamble) phase, parties follow some protocol by exchanging messages in some 
appropriate communication model. This communication model can vary depending on 
the exact setting we consider. But once the setting is agreed upon, the format of the 
cheap talk phase is well defined. After the preamble, the game phase will start and the 
parties simply play the original game G. In particular, the payoffs of the extended game 
are exactly the payoffs that the parties get in G (and this explains why the preamble 
phase is called “cheap talk”). 

Correspondingly, the strategy x_i of party P_i in the extended game consists of its 
strategy in the cheap talk phase, followed by the choice of an action s_i that P_i will 
play in G. Just like in strategic games, we assume that the game phase must always go 
on. Namely, aborting the game phase will be modeled inside G, but only if necessary. 
However, the parties can always abort the preamble phase of the extended game, and 
prematurely decide to move on to the game phase. Thus, a valid strategy profile for the 
extended game must include instructions of which action to play if some other party 
refuses to follow its strategy, or, more generally, deviates from the protocol instructions 
during the cheap talk phase (with abort being a special case of such misbehavior). 


Nash equilibrium of extended games. With this in mind, (Bayesian) Nash equilibrium 
for extended games is defined as before. We remark, however, that Nash equilibrium 
is known to be too liberal for extensive form games, as it allows for “unreasonable” 
strategy profiles to satisfy the definition of NE. For example, it allows for equilibrium 
strategies containing so-called “empty threats” and has other subtle deficiencies. 
Nevertheless, in order to keep our presentation simple, we will primarily restrict ourselves 
to the basic concept of NE when talking about extended games. 


Collusions. All the discussion so far assumed the traditional noncooperative setting, 
where agents are assumed not to form collusions. In contrast, cooperative game theory 
tries to model reasonable equilibrium concepts arising in scenarios where agents are 
allowed to form collusions. However, traditional game-theoretic treatments of such 
equilibria are fairly weak. We will come back to this issue in Section 8.4.1, where we 
provide the definition of an equilibrium that we think is the most appropriate for our 
setting and has been influenced by the MPC setting. 


8.3 Contrasting MPC and Games 


As we can see, MPC and games share several common characteristics. In both cases 
an important problem is to compute some function (s_1, ..., s_n) = f(t_1, ..., t_n; r) in a 
private manner. However, there are some key differences summarized in Figure 8.1, 
making the translation from MPC to Games and vice versa a promising but nonobvious 
task. 


Incentives and rationality. Game theory is critically built on incentives. Although it 
may not necessarily explain why parties participate in a game, once they do, they have 
a very well defined incentive. Specifically, players are assumed to be rational and 
only care about maximizing their utility. Moreover, rationality is common knowledge: 
parties are not only rational, but know that other parties are rational and utilize this 
knowledge when making their strategic decisions. In contrast, the incentives in the 


Issue               Cryptography                 Game Theory 
Incentive           Outside the model            Payoff 
Players             Totally honest or malicious  Always rational 
Solution drivers    Secure protocol              Equilibrium 
Privacy             Goal                         Means 
Trusted party       In the ideal model           In the actual game 
Punishing cheaters  Outside the model            Central part 
Early stopping      Possible                     The game goes on! 
Deviations          Usually efficient            Usually unbounded 
k-collusions        Tolerate “large” k           Usually only k = 1 


Figure 8.1. Differences between Cryptography and game theory. 
MPC setting remain external to the computation, and the reason the computation 
actually ends with a correct and meaningful output comes from the assumption on the 
parties. Specifically, in the MPC setting one assumes that there exist two diametrically 
opposite kinds of parties: totally honest and arbitrarily malicious. Thus, the settings are 
somewhat incomparable in general. On the one hand, the MPC setting may be harder as 
it has to protect against completely unexplained behavior of the malicious parties (even 
if such behavior would be irrational had utilities been defined for the parties). On the 
other hand, the Game Theory setting could be harder as it does not have the benefit of 
assuming that some of the parties (i.e., the honest parties) blindly follow the protocol. 
However, we remark that this latter benefit disappears for the basic notions of Nash 
and correlated equilibria, since there one always assumes that the other parties follow 
the protocol when considering whether or not to deviate. For such basic concepts, we 
will indeed see in Section 8.4.2 that the MPC setting is more powerful. 


Privacy and solution drivers. In the cryptographic setting the objective is to achieve a 
secure protocol, as defined in Definition 8.1. In particular, the main task is to eliminate 
the trusted party in a private and resilient way. In the game theory setting, by contrast, the goal 
is to achieve “stability” by means of some appropriate equilibrium. In particular, the 
existence of the mediator is just another “parameter setting” resulting in a more desirable, 
but harder to implement, equilibrium concept. Moreover, the privacy constraint on 
the mediator is merely a technical way to justify a much richer class of “explainable” 
rational behaviors. Thus, in the MPC setting privacy is the goal while in the game 
theory setting it is a means to an end. 


“Crime and punishment.” We also notice that studying deviations from the prescribed 
strategy is an important part of both the cryptographic and the game-theoretic setting. 
However, there are several key differences. 

In cryptography, the goal is to compute the function, while achieving some security 
guarantees in spite of the deviations of the faulty parties. Most protocols also enable 
the participating parties to detect which party has deviated from the protocol. Yet, even 
when exposed, in many instances no action is taken against the faulty party. Yet, when 
an action, such as removal from the computation, is taken, this is not in an attempt to 
punish the party, but rather to enable the protocol to reach its final goal of computing 
the function. In contrast, in the game-theoretic setting it is crucial to specify exactly 
how the misbehavior will be dealt with by the other parties. In particular, one typical 
approach is to design reaction strategies that will negatively affect the payoffs of the 
misbehaving party or parties. By rationality, this ensures that it is in no player’s self-interest 
to deviate from the prescribed strategy. 

We already commented on a particular misbehavior when a party refuses to partic- 
ipate in a given protocol/strategy. This is called early stopping. In the MPC setting, 
there is nothing one can do about this problem, since it is possible in the ideal model 
as well. In the Game Theory setting, however, we already pointed out that one always 
assumes that “the game goes on.” That is, if one wishes, it is possible to model stopping 
by an explicit action with explicit payoffs, but the formal game is always assumed to be 
played. Thus, if we use MPC inside a game-theoretic protocol, we will have to argue — 
from the game-theoretic point of view — what should happen when a given party aborts 
the MPC. 
Efficiency. Most game-theoretic literature places no computational limitations on the 
efficiency of a party when deciding whether or not to deviate. In contrast, a significant 
part of cryptographic protocol literature is designed to only withstand computationally 
bounded adversaries. 


Collusions. Finally, we comment again on the issue of collusions. Most game-theoretic 
literature considers the noncooperative setting, which corresponds to collusions of size 
k = 1. In contrast, in the MPC setting the case k = 1 is usually straightforward, and 
a lot of effort is made to make the maximum collusion threshold as high as possible. 
Indeed, in most MPC settings one can tolerate at least a linear fraction of colluding 
parties, and sometimes even a collusion of all but one party. 


8.4 Cryptographic Influences on Game Theory 


In this section we discuss how the techniques and notions from MPC and cryptography 
can be used in Game Theory. We start by presenting the notions of computational 
and k-resilient equilibria, which were directly influenced by cryptography. We then 
proceed by describing how to use appropriate MPC protocols and replace the mediator 
implementing a given CE by a “payoff-equivalent” cheap-talk phase in a variety of 
contexts. 


8.4.1 New Notions 


Computational equilibrium. Drawing from the cryptographic world, we consider 
settings where parties participating in the extended game are computationally bounded, 
and we define the notion of computational equilibrium. In this case we only have to 
protect against efficient misbehavior strategies $x_i$. A bit more precisely, we will assume 
that the basic game G has constant size. However, when designing the preamble phase 
of the extended game, we can parameterize it by the security parameter $\lambda$, in which 
case (a) all the computation and communication shall be done in time polynomial in 
$\lambda$; and (b) the misbehavior strategies $x_i$ are also restricted to run in time polynomial 
in $\lambda$. 

The preamble phase will be designed under the assumption of the existence of a 
computationally hard problem. However, this introduces a negligible probability (see 
Section 8.1.1) that within $x_i$ the attacker might break (say, by luck) the underlying 
hard problem, and thus might get considerably higher payoff than by following the 
equilibrium strategy $x_i^*$. Of course, this can improve this party’s expected payoff by at 
most a negligible amount (since the parameters of G, including the highest payoff, are 
assumed constant with respect to $\lambda$), so we must make an assumption that the party will 
not bother to deviate if its payoffs will increase only by a negligible amount. This gives 
rise to the notion of computational Nash equilibrium: a tuple of independent strategies 
$x_1^*, \ldots, x_n^*$, where each strategy is efficient in $\lambda$, such that for every $P_i$ and for every 
alternative strategy $x_i$ efficient in $\lambda$, we have $u_i(x_i^*, x_{-i}^*) \ge u_i(x_i, x_{-i}^*) - \epsilon$, where $\epsilon$ is 
a negligible function of $\lambda$. 
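
To make the "negligible amount" concrete, here is a minimal sketch. It assumes, purely for illustration, that a deviation breaks the hard problem with probability 2^(-lam) and then collects the highest payoff of G. The payoff constants and the function name `expected_gain` are hypothetical and are not taken from the text.

```python
from fractions import Fraction


def expected_gain(lam, u_max, u_eq):
    """Upper bound on the expected payoff gain of a deviation that only
    helps when the underlying hard problem is broken.

    Assumption (illustrative only): the break succeeds with
    probability 2**-lam, where lam is the security parameter.
    """
    p_break = Fraction(1, 2 ** lam)
    return p_break * (u_max - u_eq)


# Hypothetical constant-size game: equilibrium payoff 3, highest payoff 10.
U_MAX, U_EQ = 10, 3

for lam in (16, 32, 64, 128):
    gain = expected_gain(lam, U_MAX, U_EQ)
    # A negligible function eventually drops below 1/lam**c for every
    # constant c; here we spot-check c = 3.
    print(lam, float(gain), gain < Fraction(1, lam ** 3))
```

Because the payoffs of G do not grow with the security parameter, the gain shrinks as fast as the break probability does, which is why the definition can absorb it into the slack term.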


k-Resiliency. As we mentioned, the Game Theory world introduced several flavors of 
cooperative equilibria concepts. Yet, for our purposes here, we define a stronger type 
of such an equilibrium, called a resilient (Nash or Correlated) equilibrium. Being a 
very strong notion of an equilibrium, it may not exist in most games. Yet, we choose to 
present it since it will exist in the “Game Theory-MPC” setting, where we will use MPC 
protocols in several game-theoretic scenarios. The possibility of realizing such strong 
equilibria using MPC shows the strength of the cryptographic techniques. Furthermore, 
with minor modifications, most of the results we present later in the chapter extend to 
weaker kinds of cooperative equilibria, such as various flavors of the better-known 
coalition-proof equilibrium.³ 

Informally, resilient equilibrium requires protection against all coalitional deviations 
that strictly benefit even one of its colluding parties. Thus, no such deviation will be 
justifiable to any member of the coalition, meaning that the equilibrium strategies 
are very stable. A bit more formally, an independent strategy profile $(x_1^*, \ldots, x_n^*)$ is a 
k-resilient Nash equilibrium of G if for all coalitions C of cardinality at most k, all 
correlated deviation strategies $x_C$ of the members of C, and all members $P_i \in C$, we 
have $u_i(x_C^*, x_{-C}^*) \ge u_i(x_C, x_{-C}^*)$. Thus, no coalition member benefits by $x_C$. 
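
The definition can be checked by brute force on small games. The sketch below is an illustration only: it handles pure strategy profiles, and the function name and the Prisoner's Dilemma payoffs are assumptions, not part of the text. Only pure joint deviations are enumerated. This suffices for a pure profile, because a correlated deviation is a mixture of pure joint deviations, so a member can gain from the mixture only if it gains from one of them.

```python
from itertools import combinations, product


def is_k_resilient(actions, utility, profile, k):
    """Check whether a pure profile is a k-resilient Nash equilibrium.

    actions: list of action lists, one per player.
    utility: function (player_index, profile_tuple) -> payoff.
    profile: tuple with one action per player (the candidate x*).
    """
    n = len(actions)
    for size in range(1, k + 1):
        for coalition in combinations(range(n), size):
            for joint in product(*(actions[i] for i in coalition)):
                deviated = list(profile)
                for i, a in zip(coalition, joint):
                    deviated[i] = a
                deviated = tuple(deviated)
                # Resiliency fails as soon as ONE member strictly gains.
                if any(utility(i, deviated) > utility(i, profile)
                       for i in coalition):
                    return False
    return True


# Hypothetical Prisoner's Dilemma payoffs (player 0, player 1).
PD = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
      ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def pd_utility(i, prof):
    return PD[prof][i]


acts = [["C", "D"], ["C", "D"]]
print(is_k_resilient(acts, pd_utility, ("D", "D"), 1))  # ordinary Nash check
print(is_k_resilient(acts, pd_utility, ("D", "D"), 2))  # joint deviations too
```

In this example mutual defection passes the check for k = 1 but fails for k = 2, since both players gain by jointly switching to cooperation. This illustrates why resilient equilibria need not exist in most games.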

The notion of k-resilient correlated equilibrium is defined similarly, although here 
we can have two variants. In the ex ante variant, members of C are allowed to collude 
only before receiving their actions from the mediator: namely, a deviation strategy 
will tell each member of the coalition how to change its recommended action, but this 
would be done without knowledge of the recommendations to the other members of 
the coalition. In the more powerful interim variant, the members of the coalition will 
see the entire recommended action vector $s_C^*$ and then can attempt to jointly change 
it to some $s_C$. Clearly, ex ante correlated equilibria are more abundant than interim 
equilibria. For example, it is easy to construct games where already 2-resilient ex ante 
CEs achieve higher payoffs than 2-resilient interim equilibria, and even games where 
the former correlated equilibria exist and the latter do not! This is true because the ex 
ante setting makes a strong restriction that coalitions cannot form after the mediator 
gave its recommended actions. Thus, unless stated otherwise, k-resilient CE will refer 
to the interim scenario. 

Finally, we mention that one can naturally generalize the above notions to games 
with incomplete information, and also define (usual or computational) k-resilient Nash 
equilibria of extended games. 


8.4.2 Removing the Mediator in Correlated Equilibrium 


The natural question that can be asked is whether the mediator can be removed in the 
game theory setting, by simulating it with a multiparty computation. The motivation 
for this is clear, as the presence of the mediator significantly expands the number of 
equilibria in strategic form games; yet, the existence of such a mediator is a very strong 
and often unrealizable assumption. 

Recall that in any correlated equilibrium x of a strategic game G (with imperfect 
information, for the sake of generality), the mediator samples a tuple of recommended 
actions $(s_1, \ldots, s_n)$ according to the appropriate distribution based on the types of 
the parties. This can be considered as the mediator computing some probabilistic 
function $(s_1, \ldots, s_n) = f(t_1, \ldots, t_n; r)$, where r denotes the mediator’s randomness. 
We define the following extended game G* of G by substituting the mediator with an 
MPC, and ask whether the resulting strategies form a (potentially computational) Nash 
equilibrium of G*. 


3 Informally, these equilibria prevent only deviations benefiting all members of the coalition, while resilient 
equilibria also prevent deviations benefiting even a single member. 


(i) In the preamble stage, the parties run an “appropriate” MPC protocol⁴ to compute the 
profile $(s_1, \ldots, s_n)$. Some additional actions may be needed (see below). 

(ii) Once the preamble stage is finished, party $P_i$ holds a recommended action $s_i$, which it 
uses in the game G. 


Meta-Theorem. Under “appropriate” conditions, the above strategies form a (poten- 
tially computational) Nash equilibrium of the extended game G*, which achieves the 
same expected payoffs for all the parties as the corresponding correlated equilibrium 
of the original game G.⁵ 


As we discussed in Section 8.3, there are several differences between the MPC and 
the game theory settings. Not surprisingly, we will have to resolve these differences 
before validating the meta-theorem above. To make matters a bit more precise, we 
assume that 


• x is an interim k-resilient correlated equilibrium⁶ of G that we are trying to simulate. 
k = 1 (i.e., no collusions) will be the main special case. 

• The MPC protocol computing x is cryptographically secure against coalitions of up to 
k malicious parties. This means the protocol is at least correct and private, and we will 
comment about its “output delivery” guarantees later. 

• The objective is to achieve a (possibly computational) k-resilient Nash equilibrium x* 
of G* with the same payoffs as x. 


The only thing left unspecified in the definition of G* is the behavior of the 
parties in case the MPC computation fails for some reason. 


Using MPC with guaranteed output delivery. Recall that there exist MPC protocols (in 
various models) that guarantee output delivery for various resiliencies k. Namely, the 
malicious parties cannot cause the honest parties not to receive their output. The only 
thing they can do is to choose their inputs arbitrarily (where a special input $\bot$ indicates 
they refuse to provide the input). But since this is allowed in the mediated game as 
well, and k-resilient equilibrium ensures the irrationality of such behavior (assuming 
the remaining (n — k) parties follow the protocol), we know the parties will contribute 
their proper types and our meta-theorem is validated. 


Theorem 8.3 If x is a k-resilient CE of G specified by a function f, and $\pi$ is 
an MPC protocol (with output delivery) securely computing f against a coalition 
of up to k computationally unbounded/bounded parties, then running $\pi$ in the 
preamble step (and using any strategy to select a move in case some misbehavior 
occurs) yields a k-resilient regular/computational NE of the extended game G*, 
achieving the same payoffs as x. 


4 Where the type of the protocol depends on the particular communication model and the capabilities of the 
parties. 

5 Note that the converse (every NE of G* can be achieved by a CE of G) is true as well. 

6 As we already remarked, the techniques presented here easily extend to weaker coalitional equilibria concepts. 


Using fair MPC. In some instances (e.g., part i.c of Theorem 8.2) we cannot guarantee 
output delivery, but can still achieve fairness. Recall that this means that if at least one 
party $P_i$ obtains its correct output $s_i$, then all parties do. However, it might be possible 
for misbehaving parties to cause everybody to abort or complete the protocol without 
an output. 

In the case where the protocol terminates successfully, we are exactly in the same 
situation as if the protocol had output delivery, and the same analysis applies. In the 
other case, we assume that the protocol enables detection of faulty behavior and that it 
is observed that one of the parties (for simplicity, assume that it is $P_n$) deviated from 
the protocol. As the protocol is fair, the aborting deviation must have occurred before 
any party has any information about their output. The simplest solution is to restart 
the computation of x from scratch with all parties. The technical problem with this 
solution is that it effectively allows (a coalition containing) $P_n$ to mount a 
denial-of-service attack, by misbehaving in every MPC iteration and causing the preamble 
to run forever. 

Instead, to make the extended game always finite, we follow a slightly more 
sophisticated punishment strategy. We restart the preamble without $P_n$, and let the 
(n - 1) remaining parties run a new MPC to compute the (n - 1)-input function $f'$ 
on the remaining parties’ inputs and a default value $\bot$ for $P_n$: 
$f'(t_1, \ldots, t_{n-1}; r) = f(t_1, \ldots, t_{n-1}, \bot; r)$. Notice that in this new MPC n is 
replaced by n - 1 and k is replaced by k - 1 (as $P_n$ is faulty), which means that the ratio 
$\frac{k-1}{n-1} \le \frac{k}{n}$ and, thus, $f'$ can still be securely computed in the same setting as f. 
Also notice that $P_n$ does not participate in this MPC, and will have to decide by itself 
(or with the help of other colluding members) which action to play in the actual game 
phase. In contrast, parties $P_1, \ldots, P_{n-1}$ are instructed to follow the recommendations 
they get when computing $f'$, if $f'$ completes. If not, then another party (say, $P_{n-1}$) must 
have aborted this MPC, in which case we reiterate the same process of excluding $P_{n-1}$, 
and so on. Thus, the process must end at some point, as there is a finite number n of 
parties and we eliminate (at least) one in each iteration. 
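
The restart-and-exclude loop described above can be sketched as follows. This is a sketch under stated assumptions, not a protocol implementation: `run_fair_mpc` is a hypothetical callback standing in for a fair MPC with abort detection, and `toy_mpc` is a stand-in used only to exercise the loop.

```python
def run_preamble(parties, run_fair_mpc):
    """Iterate the fair-MPC preamble, excluding one detected aborter per round.

    parties: list of party identifiers.
    run_fair_mpc: hypothetical callback. Given the active parties and the
        list of excluded parties (whose inputs are replaced by the default
        value), it returns ("ok", recommendations) or ("abort", culprit).
    Returns (recommendations, excluded).
    """
    active = list(parties)
    excluded = []
    while active:
        status, result = run_fair_mpc(active, excluded)
        if status == "ok":
            return result, excluded
        # By fairness nobody has learned anything about the output yet,
        # so the culprit can be dropped and the computation restarted.
        active.remove(result)
        excluded.append(result)
    return {}, excluded


def toy_mpc(active, excluded):
    """Stand-in for the MPC: party "P3" aborts whenever it takes part."""
    if "P3" in active:
        return "abort", "P3"
    return "ok", {p: "action-for-" + p for p in active}


recs, out = run_preamble(["P1", "P2", "P3"], toy_mpc)
print(recs, out)
```

Since every failed round removes one party, the loop runs at most n times. This is the finiteness argument made in the text.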

Next, we argue that the resulting strategy profile x* forms a k-resilient Nash equi- 
librium of G*. To see this, the fairness of the MPC step clearly ensures that the only 
effective misbehavior of a coalition of size |C| is to declare invalid types $\bot$ for some of 
its members, while changing the real type for others. In this case, their reluctance to do 
so follows from the fact that such misbehavior is allowed in the mediated game as well. 
And since we assumed that the strategy profile x is a k-resilient correlated equilibrium 
of G, it is irrational for the members of the coalition to deviate in this way. 


Using correct and private MPC: Case k = 1. We can see that the previous argument 
crucially relied on the fairness of the MPC. In contrast, if the MPC used only provides 
correctness and privacy, then the members of C might find their vector of outputs 
$s_C$ before the remaining parties, and can choose to abort the computation precisely 
when one of their expected payoffs $p_i' = \mathrm{Exp}(u_i(s) \mid s_C = s_C^*)$ when playing $s_C^*$ is 
less than the a priori value $p_i = \mathrm{Exp}(u_i(s))$. In fact, even for two-player games of 
complete information, it is easy to construct a game G (e.g., the “Game of Chicken” in 
Chapter 1) where the above aborting strategy of the player who learns the output 
first will be strictly profitable for this player, even if the other player will play its 
“conditional” strategy suggested in the previous paragraph. 

Nevertheless, we show that one can still use unfair (yet private and correct) MPC 
protocols in an important special case of the problem. Specifically, we concentrate 
on the usual coalition-free case k = 1, and also restrict our attention to games with 
complete information (i.e., no types). In this case, we show that if some party $P_i$ deviates 
in the MPC stage (perhaps by aborting the computation based on its recommended 
action), the remaining parties $P_{-i}$ can sufficiently punish $P_i$ to discourage such an 
action. Let the min-max value $v_i$ for party $P_i$ denote the worst payoff that players $P_{-i}$ 
can jointly enforce on $P_i$: namely, $v_i = \min_{z_{-i} \in \Delta(S_{-i})} \max_{s_i \in S_i} u_i(s_i, z_{-i})$. 


Claim 8.4 For any correlated equilibrium x of G, any $P_i$ and any action $s_i^*$ for 
$P_i$ in the support of $x_i$, $\mathrm{Exp}(u_i(s) \mid s_i = s_i^*) \ge v_i$. 


PROOF Notice that since x is a CE, $s_i^*$ is the best response of $P_i$ to the profile 
$x_{-i}^*$ defined as $x_{-i}$ conditioned on $s_i = s_i^*$. Thus, the payoff $P_i$ gets in this case is 
what others would force on $P_i$ by playing $x_{-i}^*$, which is at least as large as what 
others could have forced by choosing the worst profile $z_{-i}$. 
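
As a sanity check of the claim, the following sketch evaluates a two-player game of Chicken. The payoff numbers and the uniform three-profile distribution are assumptions chosen for illustration, not values from the text. For simplicity the punisher is restricted to pure actions, which gives an upper bound on $v_i$, so passing the check against that bound also implies the claimed inequality.

```python
from fractions import Fraction

ACTIONS = ("C", "D")  # C = chicken out, D = dare
# Hypothetical payoffs (row player, column player).
PAYOFF = {("C", "C"): (4, 4), ("C", "D"): (1, 5),
          ("D", "C"): (5, 1), ("D", "D"): (0, 0)}
# Candidate correlated equilibrium: uniform over three of the four profiles.
CE = {("C", "C"): Fraction(1, 3), ("C", "D"): Fraction(1, 3),
      ("D", "C"): Fraction(1, 3)}


def conditional_payoff(player, recommended, played):
    """Expected payoff of `player` who was told `recommended` but plays
    `played`, while the other player follows its own recommendation."""
    total, weight = Fraction(0), Fraction(0)
    for profile, prob in CE.items():
        if profile[player] != recommended:
            continue
        actual = list(profile)
        actual[player] = played
        total += prob * PAYOFF[tuple(actual)][player]
        weight += prob
    return total / weight


def pure_min_max(player):
    """Min-max value with the punisher restricted to pure actions.
    This is an upper bound on v_i, which also allows mixed punishments."""
    values = []
    for z in ACTIONS:
        profiles = [(s, z) if player == 0 else (z, s) for s in ACTIONS]
        values.append(max(PAYOFF[p][player] for p in profiles))
    return min(values)


for player in (0, 1):
    for rec in ACTIONS:
        obey = conditional_payoff(player, rec, rec)
        best_deviation = max(conditional_payoff(player, rec, a) for a in ACTIONS)
        # First flag: obeying is a best response (CE condition).
        # Second flag: the conditional payoff is at least the min-max bound.
        print(player, rec, obey, obey == best_deviation,
              obey >= pure_min_max(player))
```

In this particular game a mixed punishment cannot push the punished player below the pure bound, so the bound coincides with $v_i$. In general the true min-max value requires optimizing over $\Delta(S_{-i})$.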


Now, in case $P_i$ would (unfairly) abort the MPC step, we will instruct the other 
parties $P_{-i}$ to punish $P_i$ to its min-max value $v_i$. More specifically, parties $P_{-i}$ should 
play the correlated strategy $z_{-i}$, which would force $P_i$ into getting at most $v_i$. Notice, 
however, that since this strategy is correlated, they would need to run another MPC 
protocol to implement $z_{-i}$.⁷ By the above claim, irrespective of the recommendation $s_i$ 
that $P_i$ learned, the corresponding payoff of $P_i$ can only go down by aborting the MPC. 
Therefore, it is in $P_i$’s interest not to abort the computation after learning $s_i$. 

We notice that the above punishment strategy does not straightforwardly generalize 
to more advanced settings. For example, in case of coalitions it could be that the 
min-max punishment for $P_1$ tremendously benefits another colluding party $P_2$ (who poses 
as honest and instructs $P_1$ to abort the computation to get high benefits for itself). Also, 
in the case of incomplete information, it is not clear how to even define the min-max 
punishment, since the parties do not even know the precise utility of $P_i$! 


8.4.3 Stronger Equilibria 


So far we talked only about plain Nash equilibria of the extended game G*. As we 
already commented briefly, Nash equilibria are usually too weak to capture extensive- 
form games. Therefore, an interesting (and still developing!) direction in recent research 
is to ensure much stronger and more stable equilibria that would simulate correlated 
equilibria of the original game. 


Eliminating empty threats. One weakness of the Nash equilibrium is that it allows for 
the so-called empty threats. Consider, for example, the min-max punishment strategy 
used above. In some games, punishing a misbehaving party to its min-max value 
is actually very damaging for the punishers as well. Thus, the threat to punish the 
misbehaving party to the min-max value is not credible in such cases, despite being 
an NE. In this case, eliminating such an empty threat could be done by modifying the 
punishment strategy to playing the worst Nash equilibrium of G for $P_i$ (in terms of $P_i$’s 
payoff) when $P_i$ is caught cheating. Unlike the min-max punishment, this is no longer 
an empty threat because it is an equilibrium of G. However, it does limit (although 
slightly) the class of correlated equilibria one can simulate, as one can achieve only a 
payoff vector which is at least as large as the worst Nash equilibrium for each player. 
In addition, formally defining such so-called subgame-perfect or sequential equilibria 
has not yet been done in the computational setting, where most MPC protocols are 
analyzed. 


7 Notice that there are no dishonest parties left, so any MPC protocol for the honest-but-curious case would work. 


Ex ante correlated equilibria. So far we only talked about simulating interim 
correlated equilibria, where colluding parties can choose their actions after seeing all 
their recommendations. Another interesting direction is that of simulating ex ante 
correlated equilibria, where colluding parties can only communicate prior to contacting 
the mediator. To implement this physical restriction in real life, we need to design 
collusion-free protocols, where one has to ensure that no subliminal communication 
(a.k.a. steganography) is possible. This is a very difficult problem. Indeed, most cryp- 
tographic protocols need randomness (or entropy), and it is known that entropy almost 
always implies steganography. In fact, it turns out that, in order to build such protocols, 
one needs some physical assumptions in the real model as well. On the positive side, it 
is known that envelopes (and a broadcast channel) are enough for building a class of 
collusion-free protocols sufficient to simulate ex ante correlated equilibria without the 
mediator. 


Iterated deletion of weakly dominated strategies. In Section 8.5.2 we will study 
a fairly general class of “function evaluation games,” where the objective is to 
achieve a Nash equilibrium that survives so-called iterated deletion of weakly dominated 
strategies. 


Strategic and privacy equivalence. The strongest recent results on removing 
the mediator ensure a (polynomially efficient) “real-life” simulation that guarantees 
an extremely strong property called strategic and privacy equivalence. Intuitively, 
it implies that our simulation gives exactly the same power in the real model as in 
the ideal model. As such, it precisely preserves all different types of equilibria of 
the original game (e.g., without introducing new, unexpected equilibria in the 
extended game, which we allowed so far), does not require the knowledge of the utility 
functions or an a priori type distribution (which most of the other results above do), 
does not give any extra power to arbitrary coalitions, preserves privacy of the 
players’ types as much as in the ideal model, and has other attractive properties. Not 
surprisingly, strategic and privacy equivalence is very difficult to achieve, and 
requires some physical assumptions in the real model as well. The best known result 
is an extension of the MPC result ii.c in Theorem 8.2, and shows how to implement 
strategic and privacy equivalence assuming a broadcast channel, envelopes, and a 
ballot box. 

To summarize, MPC techniques are promising in replacing the mediator by cheap 
talk in a variety of situations. However, more work has to be done in trying to achieve 
stronger kinds of equilibria using weaker assumptions. 


8.5 Game Theoretic Influences on Cryptography 


The influence of Game Theory on Multiparty Computation has manifested itself in 
modeling multiparty computation with a game-theoretic flavor by introducing rational 
parties with some natural utility functions into the computation. Once this is done, 
two main areas of investigation are as follows. First, we try to characterize the class 
of functions where it is in the parties’ selfish interest to report their true inputs to the 
computation. We call such functions noncooperatively computable (NCC). Second, we 
can ask to what extent the existing MPC protocols (used to compute NCC functions) 
form an appropriate equilibrium for the extended game, where we remove the trusted 
mediator by cheap talk computing the same function. As we will see, the answer will depend 
on the strength of the equilibrium we desire (and, of course, on the natural utilities we 
assign to the “function evaluation game” defined below). Furthermore, issues arising 
in the MPC “honest vs. malicious” setting also hold in the Game Theory “rational” 
setting, further providing a synergy between these two fields. 


8.5.1 Noncooperatively Computable Functions 


In order to “rationalize” the process of securely evaluating a given function f, we first 
need to define an appropriate function evaluation game. For concreteness, we 
concentrate on single-output functions $f(t_1, \ldots, t_n)$, although the results easily generalize to 
the n-output case. We also assume that each input $t_i$ matters (i.e., for some $t_{-i}$ the value 
of f is not yet determined without $t_i$). 


Function evaluation game. We assume that the parties’ types $t_i$ are their inputs to f 
(which are selected according to some probability distribution D having full support). 
The action of each party $P_i$ is its guess about the output $s^*$ of f. The key question, 
however, is how to define the utilities of the parties. Now, there are several natural 
cryptographic considerations that might weigh into the definition of party $P_i$’s utility. 


• Correctness. Each $P_i$ wishes to compute f correctly. 

• Exclusivity. Each $P_i$ prefers that other parties $P_j$ not learn the value of f correctly. 

• Privacy. Each $P_i$ wishes to leak as little as possible about its input $t_i$ to the other parties. 

• Voyeurism. Each $P_i$ wishes to learn as much as possible about the other parties’ inputs. 


Not surprisingly, one can have many different definitions for a cryptographically 
motivated utility function of party $P_i$. In turn, different definitions would lead to 
different results. For concreteness, we will restrict ourselves to one of the simplest and, 
arguably, most natural choices. Specifically, we will consider only correctness and ex- 
clusivity, and value correctness over exclusivity. However, other choices might also be 
interesting in various situations, so our choice here is certainly with a loss of generality. 

A bit more formally, recall that the utility $u_i$ of party $P_i$ depends on the true type 
vector t of all the parties, and the parties’ actions $s_1, \ldots, s_n$. Notice that the true type 
vector t determines the correct function value $s^* = f(t)$, and the parties’ actions determine 
the boolean vector $\mathit{correct} = (\mathit{correct}_1, \ldots, \mathit{correct}_n)$, where $\mathit{correct}_i = 1$ if and only 
if $s_i = s^*$. In our specific choice of the utility function, we will assume that the utilities 
of each party depend only on the boolean vector $\mathit{correct}$: namely, which of the parties 
learned the output and which did not. Therefore, we will write $u_i(\mathit{correct})$ to denote 
the utility of party $P_i$. Now, rather than assigning somewhat arbitrary numbers to 
capture correctness and exclusivity, we state only the minimal constraints that imply 
these properties. The correctness constraint states that $u_i(\mathit{correct}) > u_i(\mathit{correct}')$ 
whenever $\mathit{correct}_i = 1$ and $\mathit{correct}'_i = 0$. Similarly, the exclusivity constraint states that 
if (a) $\mathit{correct}_i = \mathit{correct}'_i$, (b) for all $j \ne i$ we have $\mathit{correct}_j \le \mathit{correct}'_j$, while (c) 
for some j actually $\mathit{correct}_j = 0$ and $\mathit{correct}'_j = 1$, then $u_i(\mathit{correct}) > u_i(\mathit{correct}')$. 
Namely, provided $P_i$ has the same success in learning the output, it prefers as few 
parties as possible to be successful. 
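
The two constraints can be verified mechanically for any candidate utility on a small number of parties. The sketch below is illustrative only: the utility `utility` (weight n on one's own correctness, minus one per other correct party) and the weaker variant `weak_utility` are hypothetical examples, not utilities proposed in the text.

```python
from itertools import product


def utility(i, correct):
    """Hypothetical utility: weight n on own correctness, minus one for
    every other party that also learned the output."""
    n = len(correct)
    others = sum(correct) - correct[i]
    return n * correct[i] - others


def weak_utility(i, correct):
    """Hypothetical utility that values own correctness too little."""
    return correct[i] - (sum(correct) - correct[i])


def satisfies_constraints(u, n):
    """Check the correctness and exclusivity constraints for n parties."""
    vectors = list(product((0, 1), repeat=n))
    for i in range(n):
        for c, c2 in product(vectors, repeat=2):
            # Correctness: being right always beats being wrong.
            if c[i] == 1 and c2[i] == 0 and not u(i, c) > u(i, c2):
                return False
            # Exclusivity: same own success, no other party more successful
            # under c than under c2, and at least one strictly less.
            same_self = c[i] == c2[i]
            fewer = all(c[j] <= c2[j] for j in range(n) if j != i)
            strict = any(c[j] < c2[j] for j in range(n) if j != i)
            if same_self and fewer and strict and not u(i, c) > u(i, c2):
                return False
    return True


print(satisfies_constraints(utility, 3))
print(satisfies_constraints(weak_utility, 3))
```

The second utility fails the correctness constraint: with three parties, being correct alongside two others is worth less to it than everybody being wrong. The first utility is the kind of "value correctness over exclusivity" choice the text adopts.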


Noncooperatively computable functions. Having defined the function evaluation game, 
we can now ask what are the equilibria of this game. In this case, Nash equilibria are not 
very interesting, since parties typically have too little information to be successful with 
any nontrivial probability. On the other hand, it is very interesting to study correlated 
equilibria of this game. Namely, parties give their inputs $t_i$ to the mediator M, who then 
recommends an action $s_i$ for each party. Given that each party is trying to compute 
the value of the function f, it is natural to consider the “canonical” mediator strategy: 
namely, that of evaluating the function f on the reported type vector t, and simply 
recommending each party to “guess” the resulting function value s* = f(t). Now, we 
can ask the question of characterizing the class of functions f for which this canonical 
strategy is indeed a correlated equilibrium of the function evaluation game. To make 
this precise, though, we also need to define the actions of the mediator if some party 
gives a wrong type to the mediator. Although several options are possible, here we 
will assume that the mediator will send an error message to all the parties and let them 
decide by themselves what to play. 


Definition 8.5 We say that a function f is noncooperatively computable (NCC) 
with respect to utility functions $\{u_i\}$ (and a specific input distribution D) if the 
above canonical mediated strategy is a correlated equilibrium of the function 
evaluation game. Namely, it is in the parties’ selfish interest to honestly report 
their true inputs to the mediator. 


We illustrate this definition by giving two classes of functions that are never NCC. 
Let us say that a function f is dominated if there exists an index i and an input 
$t_i$ that determine the value of f irrespective of the other inputs $t_{-i}$. Clearly, for 
such an input $t_i$ it is not in the interest of $P_i$ to submit $t_i$ to the mediator, as $P_i$ 
is assured of $\mathit{correct}_i = 1$ even without the help of M, while every other party is 
not (for at least some of its inputs). Thus, dominated functions cannot be NCC. For 
another example, a function f is reversible if for some index i and some input $t_i$, 
there exists another input $t_i'$ and a function g, such that (a) for all other parties’ inputs 
$t_{-i}$ we have $g(f(t_i', t_{-i}), t_i) = f(t_i, t_{-i})$, and (b) for some other parties’ inputs $t_{-i}$ 
we have $f(t_i', t_{-i}) \ne f(t_i, t_{-i})$. Namely, property (a) states that there is no risk in 
terms of correctness for $P_i$ to report $t_i'$ instead of $t_i$, while property (b) states that 
at least sometimes $P_i$ will be rewarded by higher exclusivity. A simple example of 
such a (boolean) function is the parity function: negating one’s input always negates the 
outcome, but still in a manner easily correctable by negating the output back. Clearly, 
reversible functions are also not NCC. 
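
Both properties are easy to test by enumeration for small functions. The following sketch assumes binary inputs and represents f as a Python function on a tuple of inputs. The helper names and the three test functions are illustrative choices, not part of the text.

```python
from itertools import product


def with_input(rest, i, value):
    """Insert `value` as party i's input into the others' inputs `rest`."""
    return rest[:i] + (value,) + rest[i:]


def is_dominated(f, n, domain=(0, 1)):
    """True if some single input t_i fixes f regardless of the others."""
    rests = list(product(domain, repeat=n - 1))
    for i in range(n):
        for t_i in domain:
            outputs = {f(with_input(r, i, t_i)) for r in rests}
            if len(outputs) == 1:
                return True
    return False


def is_reversible(f, n, domain=(0, 1)):
    """True if some party can misreport t_alt for t_i, always repair the
    output with a correction g(., t_i), and sometimes change the output."""
    rests = list(product(domain, repeat=n - 1))
    for i in range(n):
        for t_i, t_alt in product(domain, repeat=2):
            if t_i == t_alt:
                continue
            correction = {}  # candidate g(., t_i), built output by output
            consistent, differs = True, False
            for r in rests:
                lied = f(with_input(r, i, t_alt))
                truth = f(with_input(r, i, t_i))
                if correction.setdefault(lied, truth) != truth:
                    consistent = False  # no single g repairs both cases
                    break
                differs = differs or lied != truth
            if consistent and differs:
                return True
    return False


def parity(t):
    return sum(t) % 2


def conjunction(t):
    return int(all(t))


def majority(t):
    return int(sum(t) > len(t) / 2)


print(is_dominated(parity, 2), is_reversible(parity, 2))
print(is_dominated(conjunction, 2), is_reversible(conjunction, 2))
print(is_dominated(majority, 3), is_reversible(majority, 3))
```

On these examples parity is reversible but not dominated, the two-input AND is both, and three-input majority is neither.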

In general, depending on the exact utilities and the input distribution D, other 
functions might also be non-NCC. However, if we assume that the risk of losing 
correctness is always too great to be tempted by higher exclusivity, it turns out that these 
two classes are the only non-NCC functions. (And, thus, most functions, like majority, 
are NCC.) More precisely, assume that the utilities and the input distribution D are 
such that for all vectors $\mathit{correct}$, $\mathit{correct}'$, $\mathit{correct}''$ satisfying $\mathit{correct}_i = \mathit{correct}'_i = 1$, 
$\mathit{correct}''_i = 0$, we have $u_i(\mathit{correct}) > (1 - \epsilon)\, u_i(\mathit{correct}') + \epsilon\, u_i(\mathit{correct}'')$, where $\epsilon$ is 
the smallest probability in D. Namely, if by deviating from the canonical strategy 
there is even a minuscule chance of $P_i$ not learning the value of f correctly, this loss 
will always exceed any potential gain caused by many other parties not learning the 
outcome as well. In this case we can show the following: 


Theorem 8.6 Under the above assumption, a function f is NCC if and only if 
it is not dominated and not reversible.⁸ 


Collusions. So far we concentrated on the case of no collusions; i.e., k = 1. However, 
one can also define (a much smaller class of) k-Non-Cooperatively Computable (k- 
NCC) functions, for which no coalition of up to k parties has any incentive to deviate 
from the canonical strategy of reporting their true types. One can also characterize 
k-NCC functions under appropriate assumptions regarding the utilities and the input 
distribution D. 


8.5.2 Rational Multiparty Computation 


Assume that a given function f is k-NCC, so it is in the parties’ own interest to 
contribute their inputs in the ideal model. We now ask the same question as in Section 
8.4: can we replace the mediator computing f by a corresponding MPC protocol for 
f? Notice, by doing so the parties effectively run the cryptographic MPC protocol 
for computing f. Thus, a positive answer would imply that a given MPC protocol 
$\pi$ securely computes f not only from a cryptographic point of view but also from a 
game-theoretic, rational point of view! Fortunately, since the function evaluation game 
is just a particular game, Theorem 8.3 immediately implies 


Theorem 8.7 If f is a k-NCC function (w.r.t. some utilities and input 
distribution) and $\pi$ is an MPC protocol securely computing f against a coalition 
of up to k computationally unbounded/bounded parties, then $\pi$ is a k-resilient 
regular/computational Nash equilibrium for computing f in the corresponding 
extended game. 


From a positive perspective, this result shows that for the goal of achieving just a 
Nash equilibrium, current MPC protocols can be explained in rational terms, as long 
as the parties are willing to compute f in the ideal model. From a negative perspective, 
the latter constraint nontrivially limits the class of functions f that can be rationally 
explained, and it is an interesting open problem how to rationalize MPC even for 
non-NCC functions, for which the cryptographic definition still makes perfect sense. 


8 In fact, under our assumption that each party’s input matters in some cases and D has full support, it is easy to 
see that every dominated function is also reversible. 


Stronger equilibria. As another drawback, we already mentioned that the notion of 
Nash equilibrium is really too weak to capture the rationality of extensive-form pro- 
cesses, such as multiparty computation protocols. Thus, an important direction is to 
try achieving stronger kinds of equilibria explaining current MPC protocols, or, alter- 
natively, design robust enough MPC protocols which would achieve such equilibria. 
In Section 8.4.3, we briefly touched on several general results in this direction (which 
clearly still apply to the special case of the function evaluation games). Here we will 
instead concentrate on the specifics of computing the function under the correctness 
and exclusivity preferences defined in the previous section, and will study a specific 
refinement of the Nash equilibrium natural for these utility functions. 

To motivate our choice, let us see a particular problem with current MPC protocols. 
Recall, such protocols typically consist of three stages; in the first two stages the parties 
enter their inputs and compute the secret-sharing of the output of f, while the last stage 
consists of the opening of the appropriate output shares. Now we claim that the strategy 
of not sending out the output shares is always at least as good as, and sometimes better 
than, the strategy of sending the output shares. Indeed, consider any party $P_i$. The 
correctness of output recovery for $P_i$ is not affected by whether or not $P_i$ sent its own 
share, irrespective of the behavior of the other parties. Yet, not sending the share to 
others might, in some cases, prevent others from reconstructing their outputs, resulting 
in higher exclusivity for $P_i$. True, along the Nash equilibrium path of Theorem 8.7, 
such cases where the share of $P_i$ was critical did not exhibit themselves. Still, in reality 
it seems that there is no incentive for any party to send out their shares, since this 
is never better, and sometimes worse than not sending the shares. This motivates the 
following definition. 


Definition 8.8 We say that a strategy $s \in S_i$ is weakly dominated by $s' \in S_i$
with respect to $S_{-i}$ if (a) there exists $s_{-i} \in S_{-i}$ such that $u_i(s, s_{-i}) < u_i(s', s_{-i})$
and (b) for all strategies $s'_{-i} \in S_{-i}$ we have that $u_i(s, s'_{-i}) \le u_i(s', s'_{-i})$. We define
iterated deletion of weakly dominated strategies (IDoWDS) as the following
process. Let $\mathrm{DOM}_i(S_1, \ldots, S_n)$ denote the set of strategies in $S_i$ that are weakly
dominated with respect to $S_{-i}$. Let $S_i^0 = S_i$ and for $j \ge 1$ define $S_i^j$ inductively as
$S_i^j = S_i^{j-1} \setminus \mathrm{DOM}_i(S_1^{j-1}, \ldots, S_n^{j-1})$, and let $S_i^\infty = \bigcap_{j \ge 1} S_i^j$. Finally, we say that
a Nash equilibrium $(x_1, \ldots, x_n)$ survives IDoWDS, if each $x_i$ is fully supported
within $S_i^\infty$.
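The deletion process of Definition 8.8 can be sketched for small finite games as follows. This is only an illustration under stated assumptions: the function names, the two-party "send"/"withhold" game, and its numeric utilities (2 for learning the output alone, 1 if both learn it, 0 for not learning it) are hypothetical and chosen to mimic the share-opening stage discussed above.

```python
from itertools import product

def weakly_dominated(i, s, t, strategy_sets, utility):
    """True if player i's strategy s is weakly dominated by t w.r.t. the others' sets."""
    others = [strategy_sets[j] for j in range(len(strategy_sets)) if j != i]
    strictly_better_somewhere = False
    for rest in product(*others):
        profile_s = rest[:i] + (s,) + rest[i:]
        profile_t = rest[:i] + (t,) + rest[i:]
        u_s, u_t = utility(i, profile_s), utility(i, profile_t)
        if u_s > u_t:          # condition (b) fails
            return False
        if u_s < u_t:          # condition (a) holds for this profile
            strictly_better_somewhere = True
    return strictly_better_somewhere

def idowds(strategy_sets, utility):
    """Delete, for all players at once, every weakly dominated strategy until none remain."""
    current = [list(s) for s in strategy_sets]
    while True:
        dominated = [
            {s for s in current[i]
             if any(weakly_dominated(i, s, t, current, utility)
                    for t in current[i] if t != s)}
            for i in range(len(current))
        ]
        if not any(dominated):
            return current
        current = [[s for s in current[i] if s not in dominated[i]]
                   for i in range(len(current))]

def opening_utility(i, profile):
    """Hypothetical 2-party utilities: a party learns the output iff the other one sends."""
    learns = profile[1 - i] == "send"
    other_learns = profile[i] == "send"
    if not learns:
        return 0
    return 1 if other_learns else 2

survivors = idowds([["send", "withhold"], ["send", "withhold"]], opening_utility)
```

In this toy game "send" is weakly dominated by "withhold" for both parties, so only "withhold" survives, which matches the informal argument about output shares given before the definition.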


k-resilient Nash equilibria surviving IDoWDS are defined similarly.⁹
Now, the above discussion implies that the k-resilient Nash equilibrium from Theorem 8.7
does not survive IDoWDS. On the positive side, the only reason for that was that


9 We notice that, in general, it matters in which order one removes the weakly dominated strategies. The specific
order chosen above seems natural, however, and will not affect the results we present below. 
the basic secret-sharing scheme where the parties are instructed to blindly open their
shares does not survive IDoWDS. It turns out that the moment we fix the secret-sharing 
scheme to survive IDoWDS, the resulting Nash equilibrium for the function evaluation 
game will survive IDoWDS too, and Theorem 8.7 can be extended to Nash equilibrium 
surviving IDoWDS. Therefore, we will treat only the latter, more concise problem. We 
remark, however, that although a Nash equilibrium surviving IDoWDS is better than 
plain Nash equilibrium, it is still a rather weak concept. For example, it still allows for 
“empty threats,” and has other undesirable properties. Thus, stronger equilibria are still 
very desirable to achieve. 


Rational secret-sharing. Recall, in the $(k, n)$-secret-sharing problem the parties are
given (random valid) shares $z_1, \ldots, z_n$ of some secret $z$, such that any $k$ shares leak
no information about $z$, while any $k + 1$ or more shares reveal $z$. We can define the
secret-sharing game, where the objective of each party is to guess the value of z, and 
where we assume that parties’ utilities satisfy the correctness and exclusivity constraints 
defined earlier. In the extended game corresponding to the secret-sharing game, the 
parties can perform some computation before guessing the value of the secret. For our 
communication model, we assume that it is strong enough to perform generic multiparty 
computation, since this will be the case in the application to the function evaluation 
game. (On the other hand, we will need only MPC with correctness and privacy, and not 
necessarily fairness.) In addition, if not already present, we also assume the existence of 
a simultaneous broadcast channel, where at each round all parties can simultaneously 
announce some message, after which they atomically receive the messages of all the 
other parties. Our goal is to build a preamble protocol for which the outcome of all 
the parties learning the secret z will be a k-resilient Nash equilibrium for the extended 
game that survives IDoWDS. 
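The threshold property recalled at the start of this paragraph can be sketched with Shamir's scheme, which the notes to this chapter name as the secret-sharing protocol used. The sketch below is a minimal illustration, not a hardened implementation: the prime modulus `P`, the function names `share` and `reconstruct`, and the evaluation points 1, ..., n are assumptions. Following the convention in the text, the polynomial has degree k, so k shares reveal nothing and k + 1 shares determine the secret.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime used as the field size (an assumption of this sketch)

def share(secret, k, n):
    """Return n shares (x, y) of `secret`; any k + 1 of them suffice to reconstruct."""
    # Random polynomial of degree at most k with constant term equal to the secret.
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(k)]

    def poly(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation modulo P
            acc = (acc * x + c) % P
        return acc

    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at 0 over the field of integers modulo P."""
    total = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * (-xm)) % P
                den = (den * (xj - xm)) % P
        total = (total + yj * num * pow(den, -1, P)) % P
    return total

example_shares = share(1234, k=2, n=5)
recovered = reconstruct(example_shares[:3])  # k + 1 = 3 shares
```

Any three of the five shares give back the same value, while two shares are consistent with every possible secret, which is the sense in which k shares leak no information.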

As we observed already, the natural 1-round preamble protocol where each party 
is supposed to simply broadcast its share does not survive IDoWDS. In fact, a simple 
backward induction argument shows that any preamble protocol having an a priori fixed 
number of simultaneous broadcast rounds (and no other physical assumptions, such as 
envelopes and ballot boxes) cannot enable the parties to rationally learn the secret and 
survive IDoWDS. Luckily, it turns out that we can have probabilistic protocols with no 
fixed upper bound on the number of rounds, but which have a constant expected number 
of rounds until each party learns the secret. We sketch the simplest such protocol below. 
W.l.o.g. we assume that the domain of the secret-sharing scheme is large enough to
deter random guessing of $z$, and also includes a special value denoted $\bot$, such that $z$ is
guaranteed to be different from $\bot$.

Let $\alpha \in (0, 1)$ be a number specified shortly. At each iteration $r \ge 1$, the parties do
the following steps:


(i) Run an MPC protocol on inputs $z_i$ which computes the following probabilistic
functionality. With probability $\alpha$, compute a fresh and random $(k, n)$-secret-sharing
$z'_1, \ldots, z'_n$ of $z$, where party $P_i$ learns $z'_i$. Otherwise, with probability $1 - \alpha$, compute
a random $(k, n)$-secret-sharing $z'_1, \ldots, z'_n$ of $\bot$, where party $P_i$ learns $z'_i$.¹⁰


10 This protocol is typically pretty efficient for the popular Shamir’s secret-sharing scheme. 
(ii) All parties $P_i$ simultaneously broadcast $z'_i$ to other parties.

(iii) If either the MPC protocol fails for even one party, or even one party fails to broadcast
the value $z'_i$, all parties are instructed to abort.

(iv) Each party tries to recover some value $z'$ from the shares received from the other
parties. If the recovery fails, or at least one share is inconsistent with the final value
$z'$, the party aborts the preamble. Otherwise, if $z' = \bot$ the parties proceed to the next
iteration, while in case $z' \ne \bot$ the parties stop the preamble and output $z'$ as their
guess for $z$.


Notice, by the privacy of the MPC step, no coalition $C$ of up to $k$ parties knows if
the value $z'$ is equal to $z$ or $\bot$. Thus, in case this coalition chooses not to broadcast
their shares, they will learn the value $z$ (while punishing all the other parties) only with
probability $\alpha$, and will never learn the value $z$ with probability $1 - \alpha$. Thus, if $\alpha$ is
small enough (depending on the particular utilities), the risk of not learning the secret
will outweigh the gain of achieving higher exclusivity. Also, it is easy to see that no
strategy of the above protocol is weakly dominated by another strategy, so the above
Nash equilibrium survives IDoWDS.
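To make "small enough" concrete, here is a rough one-shot calculation. It is a simplification under stated assumptions, not the analysis of the cited works: the three utility values `u_all` (everyone learns the secret), `u_excl` (only the deviating coalition learns it), and `u_none` (the coalition never learns it) are hypothetical names, and we assume `u_excl > u_all > u_none`. Withholding pays `alpha * u_excl + (1 - alpha) * u_none` in expectation, so it is unprofitable when this is below `u_all`.

```python
def deviation_payoff(alpha, u_excl, u_none):
    """Expected utility of a coalition that withholds its shares in one iteration."""
    return alpha * u_excl + (1 - alpha) * u_none

def alpha_threshold(u_all, u_excl, u_none):
    """Largest alpha bound below which withholding is worse than following the protocol."""
    if not (u_excl > u_all > u_none):
        raise ValueError("expected u_excl > u_all > u_none")
    # alpha * u_excl + (1 - alpha) * u_none < u_all  <=>  alpha < (u_all - u_none) / (u_excl - u_none)
    return (u_all - u_none) / (u_excl - u_none)

threshold = alpha_threshold(u_all=1.0, u_excl=2.0, u_none=0.0)
payoff_when_deviating = deviation_payoff(0.25, u_excl=2.0, u_none=0.0)
```

With these hypothetical numbers any alpha below the computed threshold makes the deviation strictly worse than the payoff of 1.0 obtained by following the protocol.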

The above protocol works for any $k$. However, it runs in expected $O(1/\alpha)$ iterations,
which is constant, but depends on the specific utilities of the parties (and the value 
k). Somewhat more sophisticated protocols are known to work for not too large k, but 
have expected number of iterations which is independent of the utilities. These results 
are summarized without further details below. 
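The expected iteration count of the simple protocol can be checked with a small simulation. This is a sketch that abstracts away the MPC and sharing steps entirely: each iteration is modeled as an independent coin that reveals the real secret with probability alpha, and the function names and the seed are assumptions.

```python
import random

def rounds_until_secret(alpha, rng):
    """Number of iterations until the shared value is the real secret rather than the dummy."""
    rounds = 1
    while rng.random() >= alpha:  # with probability 1 - alpha the dummy value was shared
        rounds += 1
    return rounds

def average_rounds(alpha, trials, seed=0):
    """Empirical mean number of iterations over independent honest runs."""
    rng = random.Random(seed)
    return sum(rounds_until_secret(alpha, rng) for _ in range(trials)) / trials

estimate = average_rounds(alpha=0.25, trials=20000)
```

The number of iterations is geometrically distributed, so the empirical mean should be close to 1/alpha (about 4 for alpha = 0.25), which is why a smaller alpha forced by the utilities translates directly into more rounds.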


Theorem 8.9 Assume that the parties' utilities satisfy correctness over exclusivity
properties for the $(k, n)$-secret-sharing game. Then there exist $k$-resilient
Nash equilibria for the extended game that survive IDoWDS and run in expected
constant number of iterations $r$, where


• $k < n$, but $r$ depends on the specific utilities.


• $k < n/2$, $r$ is fixed, but the parties still need to know a certain parameter depending
on the specific utilities.


• $k < n/3$, $r$ is fixed, and no other information about the utilities is needed.


8.6 Conclusions 


As we have seen, the settings of MPC in cryptography and correlated equilibrium 
in game theory have many similarities, as well as many differences. Existing results 
so far started to explore these connections, but much work remains to be done. For 
example, can we use some flavors of MPC to remove the mediator, while achiev- 
ing very strong types of Nash equilibria, but with more realistic physical and other 
setup assumptions? Or, can we use game theory to “rationalize” MPC protocols for 
non-NCC functions (such as parity), or to explain other popular cryptographic tasks 
such as commitment or zero-knowledge proofs? In addition, so far “rationalizing” 
MPC using game theory resulted only in more sophisticated protocols. Are there nat- 
ural instances where assuming rationality will simplify the design of cryptographic 
tasks? 
8.7 Notes


The multiparty computation problem (Section 8.1) was introduced in Yao (1982). 
The basic definitional and construction approaches were introduced by Goldreich 
et al. (1987), in particular the paradigm of a real/ideal execution. In Section 8.1.1 
we follow the definitional framework of Canetti (2000), which is based on the works 
of Goldwasser and Levin (1990), Micali and Rogaway (1991), and Beaver (1991). 
The results mentioned in Theorem 8.2 are from the following: parts i.a and i.b from 
Goldreich et al. (1987), part i.c from Lepinski et al. (2004), part ii.a from Ben-Or et al. 
(1988) and Chaum et al. (1988), part ii.b from Rabin and Ben-Or (1989) and Beaver 
(1991), part ii.c from Izmalkov et al. (2005). The secret-sharing protocol presented is
Shamir’s Secret-Sharing (1979). The notion of indistinguishability was introduced in 
Goldwasser and Micali (1984). For a more formal and in-depth discussion on multiparty 
computations see Goldreich (2004). 

In Section 8.2 we present the classical results of Nash (1951) and Aumann (1974) for 
Nash and correlated equilibrium (respectively). The extension of correlated equilibrium 
to games with incomplete information is due to Forges (1986). The notion of extended 
games is from Barany (1992). For a broader game theory background, see the book by 
Osborne and Rubinstein (1999). 

The comparison discussion between Game Theory and Cryptography, as it appears 
in Section 8.3, was initiated by Dodis et al. (2000) and later expanded by Feigenbaum
and Shenker (2002); yet here we further expand on these points. The related discussion 
was also carried out in many other works (Abraham et al., 2006; Barany, 1992; Lepinski 
et al., 2004; Izmalkov et al., 2005). 

The notion of computational equilibrium which appears in Section 8.4.1 was intro- 
duced in Dodis et al. (2000). The work of Urbano and Vila (2002, 2004) also deals 
with the computational model, but does not explicitly define this notion. The impor- 
tance of tolerating collusions was first addressed in our setting by Feigenbaum and 
Shenker (2002). For the k-resilient equilibrium we chose the formulation of Abraham
et al. (2006), as we felt it best suited our presentation. For other related formulations, 
see the references in Abraham et al. (2006), and also a recent work of Lysyanskaya 
and Triandopoulos (2006). The results which appear in Section 8.4.2 appear in the 
following. Theorem 8.3 follows by combining results such as Dodis et al. (2000), 
Barany (1992), Ben-Porath (1998), Gerardi (2004), Urbano and Vila (2002, 2004) and 
Abraham et al. (2006). The result for using fair MPC appears in Lepinski et al. (2004). 
The introduction of a min-max punishment to deal with unfair MPC in the attempt to 
remove the mediator appears in Dodis et al. (2000). For some efficiency improvements 
to the protocol of Dodis et al. (2000), see the works of Teague (2004) and Atallah
et al. (2006). The results which appear in Section 8.4.2 appear in the following. The 
worst equilibrium punishment technique was first applied to unmediated games by 
Ben-Porath (1998). The notion of collusion free protocols which is used to implement 
ex ante equilibria is from the work of Lepinski et al. (2005). The result of achieving 
strategic and privacy equivalence under physical assumptions is from Izmalkov et al. 
(2005). 
The noncooperative computation formulation and some discussion used in Section
8.5.1 are introduced (for k = 1) by Shoham and Tennenholtz (2005), and expanded 
by McGrew et al. (2003). Theorem 8.6 is also from Shoham and Tennenholtz (2005), 
while the formulation of “correctness followed by exclusivity” utilities is from Halpern 
and Teague (2004). The results in Section 8.5.2 appear as follows: the introduction of 
rational secret-sharing surviving IDoWDS and the impossibility result of reaching it in a
fixed number of rounds are from Halpern and Teague (2004). The protocol for rational 
secret-sharing we present appears in Abraham et al. (2006) and (for k = 1) by Gordon 
and Katz (2006). Yet, a more complicated and less general solution along these lines 
appeared first (for k = 1) in Halpern and Teague (2004). Theorem 8.9 is from Abraham 
et al. (2006). For a different, but related “mixed MPC” model, see Lysyanskaya and 
Triandopoulos (2006). 


Acknowledgments 


We thank the following people for extensive discussions, explanations, and general ad- 
vice: Ittai Abraham, Ran Canetti, Hugo Krawczyk, Matt Lepinski, Anna Lysyanskaya, 
Silvio Micali, abhi shelat, and Nikos Triandopoulos, and give special thanks to our 
coauthor Shai Halevi. 


Bibliography 


I. Abraham, D. Dolev, R. Gonen, and J. Halpern. Distributed computing meets game theory: Ro- 
bust mechanisms for rational secret-sharing and multiparty computation. In Princ. of Distributed 
Computing 06, pp. 53-62. ACM Press, 2006. 

M. Atallah, M. Blanton, K. Frikken, and J. Li. Efficient Correlated Action Selection. In Financial 
Crypt., LNCS 4107:296-310. Springer, 2006. 

R. Aumann. Subjectivity and correlation in randomized strategies. J. Math. Econ., 1:67-96, 1974.

I. Barany. Fair Distribution Protocols or How the Players Replace Fortune. Math. Oper. Res., 
17(2):327-341, 1992. 

D. Beaver. Secure multiparty protocols and zero-knowledge proof systems tolerating a faulty minority. 
J. Cryptology, 4(2):75-122, 1991.

M. Ben-Or, S. Goldwasser, and A. Wigderson. Completeness theorems for noncryptographic fault- 
tolerant distributed Computations. In Proc. 20th Symp. on Theory of Computing 88, pp. 1-10. 

E. Ben-Porath. Correlation without mediation: Expanding the set of equilibrium outcomes by “cheap” 
pre-play procedures. J. Econ. Theo., 80(1):108-122, 1998. 

R. Canetti. Security and composition of multiparty cryptographic protocols. J. Cryptology,
13(1):143-202, 2000. Available at eprint.iacr.org/1998/018.

D. Chaum, C. Crepeau, and I. Damgard. Multiparty unconditionally secure protocols. In Proc. 20th 
Symp. on Theory of Computing 88, pp. 11-19. 

Y. Dodis, S. Halevi, and T. Rabin. A cryptographic solution to a game theoretic problem. In Crypto 
2000, pp. 112-130, 2000. LNCS No. 1880. 

F.M. Forges. An approach to communication equilibria. Econometrica, 54(6):1375-85, 1986.

J. Feigenbaum and S. Shenker. Distributed algorithmic mechanism design: Recent results and future 
directions. In Proc. 6th Intl. Wkshp. Disc. Algo. Meth. Mobile Comp. Comm., pp. 1-13. ACM 
Press, 2002. 




D. Gerardi. Unmediated communication in games with complete and incomplete information. J. 
Econ. Theo., 114:104-131, 2004.

O. Goldreich. Foundations of Cryptography: Volume 2. Cambridge University Press, 2004. Prelimi- 
nary version http://philby.ucsd.edu/cryptolib.html/. 

O. Goldreich, S. Micali, and A. Wigderson. How to play any mental game. In Proc. 19th STOC, pp. 
218-229. ACM, 1987. 

S. Goldwasser and L. Levin. Fair computation of general functions in presence of immoral majority. 
In Crypto ’90, LNCS 537:77-93. 

S. Goldwasser and S. Micali. Probabilistic encryption. J. Comp. Syst. Sci., 28(2):270-299, April 
1984. 

S.D. Gordon and J. Katz. Rational secret-sharing, revisited. In 5th Conf. Sec. Crypto. Networks, 2006.
Updated version available at http://eprint.iacr.org/2006/142. 

J. Halpern and V. Teague. Rational secret-sharing and multiparty computation. In Proc. of 36th STOC, 
pp. 623-632. ACM Press, 2004. 

S. Izmalkov, M. Lepinski, and S. Micali. Rational secure computation and ideal mechanism design. 
In Proc. of 46th Fdns. of Computer Science, pp. 585-595, 2005. 

M. Lepinski, S. Micali, and A. Shelat. Collusion-free protocols. In Proc. 37th Ann. ACM Symp. Theo.
Comp., pp. 543-552. ACM Press, 2005.

M. Lepinski, S. Micali, C. Peikert, and A. Shelat. Completely fair sfe and coalition-safe cheap talk.
In PODC ’04: Proc. 23rd Annual ACM Symp. Princ. Dist. Comp., pp. 1-10. ACM Press, 2004.

A. Lysyanskaya and N. Triandopoulos. Rationality and adversarial Behavior in Multi-Party Computation.
In Crypto 2006, 2006.

R. McGrew, R. Porter, and Y. Shoham. Towards a general theory of non-cooperative computation 
(extended abstract). In Theo. Aspects of Rationality and Knowledge IX, 2003. 

S. Micali and P. Rogaway. Secure computation. In Crypto ’91, LNCS 576:392-404, 1991. 

J. Nash. Non-cooperative games. Annals of Math., 54:286-295, 1951.

M.J. Osborne and A. Rubinstein. A Course in Game Theory. MIT Press, 1999. 

T. Rabin and M. Ben-Or. Verifiable secret-sharing and multiparty protocols with honest majority. In 
Proc. 21st Symp. on Theory of Computing, pp. 73-85. ACM, 1989. 

A. Shamir. How to share a secret. Comm. ACM, 22:612-613, 1979. 

Y. Shoham and M. Tennenholtz. Non-cooperative computation: Boolean functions with correctness 
and exclusivity. Theor. Comput. Sci., 343(1-2):97-113, 2005.

V. Teague. Selecting correlated random actions. In Financial Cryptography, LNCS 3110:181-195. 
Springer, 2004. 

A. Urbano and J.E. Vila. Computational complexity and communication: Coordination in two-player 
games. Econometrica, 70(5):1893-1927, 2002. 

A. Urbano and J.E. Vila. Computationally restricted unmediated talk under incomplete information. 
Econ. Theory, 23:283-320, 2004. 

A.C. Yao. Protocols for secure computations. In Proc. Fdns. of Computer Science 82, pp. 160-164, 
IEEE, 1982. 


Algorithmic Mechanism 
Design 


CHAPTER 9 


Introduction to Mechanism 
Design (for Computer Scientists) 


Noam Nisan 


Abstract 


We give an introduction to the micro-economic field of Mechanism Design slightly biased toward a 
computer-scientist’s point of view. 


9.1 Introduction 


Mechanism Design is a subfield of economic theory that is rather unique within eco- 
nomics in having an engineering perspective. It is interested in designing economic 
mechanisms, just like computer scientists are interested in designing algorithms, pro- 
tocols, or systems. It is best to view the goals of the designed mechanisms in the 
very abstract terms of social choice. A social choice is simply an aggregation of the 
preferences of the different participants toward a single joint decision. Mechanism 
Design attempts to implement desired social choices in a strategic setting, assuming
that the different members of society each act rationally in a game theoretic sense. 
Such strategic design is necessary since usually the preferences of the participants are 
private. 

This high-level abstraction of aggregation of preferences may be seen as a common 
generalization of a multitude of scenarios in economics as well as in other social 
settings such as political science. Here are some basic classic examples: 


• Elections: In political elections each voter has his own preferences between the different
candidates, and the outcome of the elections is a single social choice.

• Markets: Classical economic theory usually assumes the existence and functioning of
a “perfect market.” In reality, of course, we have only interactions between people,
governed by some protocols. Each participant in such an interaction has his own preferences,
but the outcome is a single social choice: the reallocation of goods and money.

• Auctions: Generally speaking, the more buyers and sellers there are in a market, the
more the situation becomes close to the perfect market scenario. An extreme opposite
case is where there is only a single seller: an auction. The auction rules define the social
choice: the identity of the winner.

• Government policy: Governments routinely have to make decisions that affect a
multitude of people in different ways: Should a certain bridge be built? How much pollution
should we allow? How should we regulate some sector? Clearly each citizen has a
different set of preferences but a single social choice is made by the government.


As the influence of the Internet grew, it became clear that many scenarios happening 
there can also be viewed as instances of social choice in strategic settings. The main 
new ingredient found in the Internet is that it is owned and operated by different 
parties with different goals and preferences. These preferences, and the behavior they 
induce, must then be taken into account by every protocol in such an environment. The 
protocol should thus be viewed as taking the preferences of the different participants 
and aggregating them into a social choice: the outcome of the run of the protocol. 

Conceptually, one can look at two different types of motivations: those that use 
economics to solve computer science issues and those that use computer science to 
solve economic issues: 


• Economics for CS: Consider your favorite algorithmic challenge in a computer network
environment: routing of messages, scheduling of tasks, allocation of memory, etc. When 
running in an environment with multiple owners of resources or requests, this algorithm 
must take into account the different preferences of the different owners. The algorithm 
should function well assuming strategic selfish behavior of each participant. Thus we 
desire a Mechanism Design approach for a multitude of algorithmic challenges — leading 
to a field that has been termed Algorithmic Mechanism Design. 

• CS for economics: Consider your favorite economic interaction: some type of market,
an auction, a supply chain, etc. As the Internet becomes ubiquitous, this interaction will 
often be implemented over some computerized platform. Such an implementation en- 
ables unprecedented sophistication and complexity, handled by hyperrationally designed 
software. Designing these is often termed Electronic Market Design. 


Thus, both Algorithmic Mechanism Design and Electronic Market Design can be 
based upon the field of Mechanism Design applied in complex algorithmic settings. 

This chapter provides an introduction to classical Mechanism Design, intended for 
computer scientists. While the presentation is not very different from the standard 
economic approach, it is somewhat biased toward a worst-case (non-Bayesian) point 
of view common in computer science. 

Section 9.2 starts with the general formulation of the social choice problem, points 
out the basic difficulties formulated by Arrow’s famous impossibility results, and 
deduces the impossibility of a general strategic treatment, i.e. of Mechanism Design in 
the general setting. Section 9.3 then considers the important special case where “money” 
exists, and describes a very general positive result, the incentive-compatible
Vickrey-Clarke-Groves mechanism. Section 9.4 puts everything in a wider formal context of
implementation in dominant strategies. Section 9.5 provides several characterizations 
of dominant strategy mechanisms. All the sections up to this point have considered 
dominant strategies, but the prevailing economic point of view is a Bayesian one that 
assumes a priori known distributions over private information. Section 9.6 introduces 
this setting and the notion of Bayesian-Nash equilibrium that fits it. All the treatment
in this chapter is in the very basic “private value” model, and Section 9.7 briefly points
out several extensions to the model. Finally, Section 9.8 provides bibliographic notes 
and references. 


9.2 Social Choice 


This section starts with the general social choice problem and continues with the 
strategic approach to it. The main message conveyed is that there are unavoidable 
underlying difficulties. We phrase things in the commonly used terms of political 
elections, but the reader should keep in mind that the issues are abstract and apply to 
general social choice. 


9.2.1 Condorcet’s Paradox 


Consider an election with two candidates, where each voter has a preference for one 
of them. If society needs to jointly choose one of the candidates, intuitively it is clear 
that taking a majority vote would be a good idea. But what happens if there are three 
candidates? In 1785, the Marquis de Condorcet pointed out that the natural application
of majority is problematic: consider three candidates — a, b, and c — and three voters 
with the following preferences: 


(i) $a \succ_1 b \succ_1 c$
(ii) $b \succ_2 c \succ_2 a$
(iii) $c \succ_3 a \succ_3 b$


(The notation $a \succ_i b$ means that voter $i$ prefers candidate $a$ to candidate $b$.) Now,
notice that a majority of voters (1 and 3) prefer candidate $a$ to candidate $b$. Similarly,
a majority (1 and 2) prefers $b$ to $c$, and, finally, a majority (2 and 3) prefers $c$ to $a$. The
joint majority choice is thus $a \succ b \succ c \succ a$, which is not consistent. In particular, for
any candidate that is jointly chosen, there will be a majority of voters who would want 
to change the chosen outcome. 
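The cycle in this example can be checked mechanically. The sketch below assumes each voter's preference is encoded as a tuple listing candidates from most to least preferred; the function names are hypothetical.

```python
def majority_prefers(profile, x, y):
    """True if a strict majority of voters rank x above y."""
    votes = sum(1 for ranking in profile if ranking.index(x) < ranking.index(y))
    return votes > len(profile) / 2

def pairwise_majority(profile):
    """All ordered pairs (x, y) such that a majority prefers x to y."""
    candidates = profile[0]
    return {(x, y) for x in candidates for y in candidates
            if x != y and majority_prefers(profile, x, y)}

def condorcet_winner(profile):
    """The candidate beating every other one in pairwise majority votes, or None."""
    for x in profile[0]:
        if all(majority_prefers(profile, x, y) for y in profile[0] if y != x):
            return x
    return None

# The three voters of Condorcet's example.
condorcet_profile = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
majority_relation = pairwise_majority(condorcet_profile)
winner = condorcet_winner(condorcet_profile)
```

Here the majority relation contains exactly the pairs (a, b), (b, c), and (c, a), so no candidate beats all others and `winner` is `None`.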

This immediately tells us that in general a social choice cannot be taken simply 
by the natural system of taking a majority vote. Whenever there are more than two 
alternatives, we must design some more complex “voting method” to undertake a social 
choice. 


9.2.2 Voting Methods 


A large number of different voting methods — ways of determining the outcome of such 
multicandidate elections — have been suggested. Two of the simpler ones are plurality 
(the candidate that was placed first by the largest number of voters wins) and Borda 
count (each candidate among the $n$ candidates gets $n - i$ points for every voter who
ranked him in place $i$, and the candidate with most points wins). Each of the suggested
voting methods has some “nice” properties but also some problematic ones. 
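The two rules just described can be sketched as follows. Rankings are tuples from most to least preferred; the alphabetical tie-breaking rule and the function names are assumptions of this sketch, since the text does not specify how ties are resolved.

```python
from collections import Counter

def plurality(profile):
    """Candidate placed first by the largest number of voters (ties broken alphabetically)."""
    firsts = Counter(ranking[0] for ranking in profile)
    best = max(firsts.values())
    return min(c for c, v in firsts.items() if v == best)

def borda_scores(profile):
    """Each candidate gets n - i points for every voter ranking him in place i (1-based)."""
    n = len(profile[0])
    scores = Counter()
    for ranking in profile:
        for place, candidate in enumerate(ranking, start=1):
            scores[candidate] += n - place
    return scores

def borda(profile):
    """Candidate with the most Borda points (ties broken alphabetically)."""
    scores = borda_scores(profile)
    best = max(scores.values())
    return min(c for c, v in scores.items() if v == best)

# Three voters rank a first, two rank b first and a last.
example_profile = [("a", "b", "c")] * 3 + [("b", "c", "a")] * 2
plurality_winner = plurality(example_profile)
borda_winner = borda(example_profile)
```

On this profile the two methods disagree: plurality picks a, while b collects the most Borda points because every voter ranks b first or second.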

One of the main difficulties encountered by voting methods is that they may encourage
strategic voting. Suppose that a certain voter's preferences are $a \succ_i b \succ_i c$, but he
knows that candidate a will not win (as other voters hate him). Such a voter may be 
motivated to strategically vote for b instead of a, so that b is chosen which he prefers
to c. Such strategic voting is problematic as it is not transparent, depends closely on 
the votes of the other voters, and the interaction of many strategic voters is complex. 
The main result of this section is the Gibbard—Satterthwaite theorem that states that 
this strategic vulnerability is unavoidable. We will prove the theorem as a corollary of 
Arrow’s impossibility theorem that highlights the general impossibility of designing
voting methods with certain natural good desired properties. 

Formally, we will consider a set of alternatives $A$ (the candidates) and a set of $n$
voters $I$. Let us denote by $L$ the set of linear orders on $A$ ($L$ is isomorphic to the set
of permutations on $A$). Thus for every $\prec \in L$, $\prec$ is a total order on $A$ (antisymmetric
and transitive). The preferences of each voter $i$ are formally given by $\succ_i \in L$, where
$a \succ_i b$ means that $i$ prefers alternative $a$ to alternative $b$.


Definition 9.1

• A function $F : L^n \to L$ is called a social welfare function.


• A function $f : L^n \to A$ is called a social choice function.


Thus a social welfare function aggregates the preferences of all voters into a common
preference, i.e., into a total social order on the candidates, while a social choice function 
aggregates the preferences of all voters into a social choice of a single candidate. 
Arrow’s theorem states that social welfare functions with “nice” properties must be 
trivial in a certain sense. 


9.2.3 Arrow’s Theorem 


Here are some natural properties desired from a social welfare function. 


Definition 9.2


• A social welfare function $F$ satisfies unanimity if for every $\prec \in L$, $F(\prec, \ldots, \prec) = \prec$.
That is, if all voters have identical preferences then the social preference is the
same.


• Voter $i$ is a dictator in social welfare function $F$ if for all $\prec_1, \ldots, \prec_n \in L$,
$F(\prec_1, \ldots, \prec_n) = \prec_i$. The social preference in a dictatorship is simply that of the
dictator, ignoring all other voters. $F$ is not a dictatorship if no $i$ is a dictator in it.


• A social welfare function satisfies independence of irrelevant alternatives if the
social preference between any two alternatives $a$ and $b$ depends only on the voters’
preferences between $a$ and $b$. Formally, for every $a, b \in A$ and every
$\prec_1, \ldots, \prec_n, \prec'_1, \ldots, \prec'_n \in L$, if we denote $\prec = F(\prec_1, \ldots, \prec_n)$ and $\prec' = F(\prec'_1, \ldots, \prec'_n)$,
then $a \prec_i b \Leftrightarrow a \prec'_i b$ for all $i$ implies that $a \prec b \Leftrightarrow a \prec' b$.
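For a handful of alternatives and voters these properties can be tested by brute force. The sketch below is an illustration under assumptions: orders are encoded as tuples from most to least preferred, the Borda-based welfare function with alphabetical tie-breaking is a hypothetical example, and the checks enumerate all profiles over three alternatives.

```python
from itertools import permutations, product

ALTS = ("a", "b", "c")
ORDERS = list(permutations(ALTS))  # each tuple lists alternatives from best to worst

def above(order, x, y):
    """True if x is ranked above y in the given order."""
    return order.index(x) < order.index(y)

def dictatorship(profile):
    """Social order equals the order of the first voter."""
    return profile[0]

def borda_welfare(profile):
    """Rank alternatives by Borda points; ties broken alphabetically (an assumption)."""
    score = {x: sum(len(ALTS) - 1 - o.index(x) for o in profile) for x in ALTS}
    return tuple(sorted(ALTS, key=lambda x: (-score[x], x)))

def satisfies_unanimity(F, n):
    return all(F((o,) * n) == o for o in ORDERS)

def satisfies_iia(F, n):
    profiles = list(product(ORDERS, repeat=n))
    for x, y in [("a", "b"), ("a", "c"), ("b", "c")]:
        for p, q in product(profiles, repeat=2):
            same_votes = all(above(p[i], x, y) == above(q[i], x, y) for i in range(n))
            if same_votes and above(F(p), x, y) != above(F(q), x, y):
                return False
    return True

borda_is_unanimous = satisfies_unanimity(borda_welfare, 2)
borda_is_iia = satisfies_iia(borda_welfare, 2)
```

With two voters the Borda-based function is unanimous but fails independence of irrelevant alternatives, while the dictatorship passes both checks.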


The first two conditions are quite simple to understand, and we would certainly want 
any good voting method to satisfy the unanimity condition and not to be a dictatorship. 
The third condition is trickier. Intuitively, indeed, independence of irrelevant alterna- 
tives seems quite natural: why should my preferences about c have anything to do with 
the social ranking of a and b? More careful inspection will reveal that this condition in
some sense captures some consistency property of the voting system. As we will see,
lack of such consistency enables strategic manipulation.


Theorem 9.3 (Arrow) Every social welfare function over a set of more than
2 candidates ($|A| \ge 3$) that satisfies unanimity and independence of irrelevant
alternatives is a dictatorship.


Over the years a large number of proofs have been found for Arrow’s theorem. Here 
is a short one. 


PROOF For the rest of the proof, fix F that satisfies unanimity and independence 
of irrelevant alternatives. We start with a claim showing that the same social 
ranking rule is taken within any pair of alternatives. 


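Claim (pairwise neutrality) Let $\succ_1, \ldots, \succ_n$ and $\succ'_1, \ldots, \succ'_n$ be two player
profiles such that for every player $i$, $a \succ_i b \Leftrightarrow c \succ'_i d$. Then $a \succ b \Leftrightarrow c \succ' d$,
where $\succ = F(\succ_1, \ldots, \succ_n)$ and $\succ' = F(\succ'_1, \ldots, \succ'_n)$.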


By renaming, we can assume without loss of generality that $a \succ b$ and that
$c \ne b$. Now we merge each $\succ_i$ and $\succ'_i$ into a single preference $\succ_i$ by putting $c$
just above $a$ (unless $c = a$) and $d$ just below $b$ (unless $d = b$) and preserving the
internal order within each of the pairs $(a, b)$ and $(c, d)$. Now using unanimity, we
have that $c \succ a$ and $b \succ d$, and by transitivity $c \succ d$. This concludes the proof of
the claim.

We now continue with the proof of the theorem. Take any $a \ne b \in A$, and
for every $0 \le i \le n$ define a preference profile $\pi^i$ in which exactly the first $i$
players rank $a$ above $b$, i.e., in $\pi^i$, $a \succ_j b \Leftrightarrow j \le i$ (the exact ranking of the other
alternatives does not matter). By unanimity, in $F(\pi^0)$, we have $b \succ a$, while in
$F(\pi^n)$ we have $a \succ b$. By looking at $\pi^0, \pi^1, \ldots, \pi^n$, at some point the ranking
between $a$ and $b$ flips, so for some $i^*$ we have that in $F(\pi^{i^*-1})$, $b \succ a$, while in
$F(\pi^{i^*})$, $a \succ b$. We conclude the proof by showing that $i^*$ is a dictator.


Claim Take any $c \ne d \in A$. If $c \succ_{i^*} d$ then $c \succ d$ where $\succ = F(\succ_1, \ldots, \succ_n)$.
Take some alternative $e$ which is different from $c$ and $d$. For $i < i^*$ move $e$
to the bottom in $\succ_i$, for $i > i^*$ move $e$ to the top in $\succ_i$, and for $i^*$ move $e$ so
that $c \succ_{i^*} e \succ_{i^*} d$. Using independence of irrelevant alternatives, we have not
changed the social ranking between $c$ and $d$. Now notice that players’ preferences
for the ordered pair $(c, e)$ are identical to their preferences for $(a, b)$ in $\pi^{i^*}$, while
the preferences for $(e, d)$ are identical to the preferences for $(b, a)$ in $\pi^{i^*-1}$.
Since $a \succ b$ in $F(\pi^{i^*})$ and $b \succ a$ in $F(\pi^{i^*-1})$, the pairwise neutrality claim
gives that socially $c \succ e$ and $e \succ d$, and thus by transitivity $c \succ d$.
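The search for the pivotal voter used in the proof can be sketched in a few lines. This is only an illustration: it fixes three alternatives, places c last in every ranking (the proof allows any placement of the other alternatives), and tries the construction on dictatorships, for which the theorem says the pivotal voter must be the dictator. All names are hypothetical.

```python
def profile_pi(i, n):
    """Profile in which exactly the first i of n voters rank a above b (c is ranked last)."""
    return [("a", "b", "c")] * i + [("b", "a", "c")] * (n - i)

def pivotal_voter(F, n):
    """Smallest 1-based i such that F ranks b above a on pi^(i-1) and a above b on pi^i."""
    for i in range(1, n + 1):
        before = F(profile_pi(i - 1, n))
        after = F(profile_pi(i, n))
        if before.index("b") < before.index("a") and after.index("a") < after.index("b"):
            return i
    return None

def make_dictatorship(d):
    """Social welfare function that copies the order of voter d (1-based)."""
    return lambda profile: profile[d - 1]

pivot_for_third = pivotal_voter(make_dictatorship(3), n=5)
```

For a dictatorship of voter d the flip happens exactly when voter d switches from ranking b above a to ranking a above b, so the function returns d.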


9.2.4 The Gibbard-Satterthwaite Theorem 


It turns out that Arrow’s theorem has devastating strategic implications. We will study
this issue in the context of social choice functions (rather than social welfare functions
as we have considered until now). Let us start by defining strategic manipulations.
Definition 9.4 A social choice function $f$ can be strategically manipulated by
voter $i$ if for some $\prec_1, \ldots, \prec_n \in L$ and some $\prec'_i \in L$ we have that $a \prec_i a'$ where
$a = f(\prec_1, \ldots, \prec_i, \ldots, \prec_n)$ and $a' = f(\prec_1, \ldots, \prec'_i, \ldots, \prec_n)$. That is, voter $i$
that prefers $a'$ to $a$ can ensure that $a'$ gets socially chosen rather than $a$ by
strategically misrepresenting his preferences to be $\prec'_i$ rather than $\prec_i$. $f$ is called
incentive compatible if it cannot be manipulated.
The following is a more combinatorial point of view of the same notion. 


Definition 9.5 A social choice function f is monotone if
$f(\prec_1, \ldots, \prec_i, \ldots, \prec_n) = a \neq a' = f(\prec_1, \ldots, \prec'_i, \ldots, \prec_n)$ implies that
$a' \prec_i a$ and $a \prec'_i a'$. That is, if the social choice changed from a to a' when a
single voter i changed his vote from $\prec_i$ to $\prec'_i$ then it must be because he switched
his preference between a and a'.


Proposition 9.6 A social choice function is incentive compatible if and only if 
it is monotone. 


PROOF Take $\prec_1, \ldots, \prec_{i-1}, \prec_{i+1}, \ldots, \prec_n$ out of the quantification. Now,
logically, “NOT monotone between $\prec_i$ and $\prec'_i$” is equivalent to “A voter with
preference $\prec_i$ can strategically manipulate f by declaring $\prec'_i$” OR “A voter with
preference $\prec'_i$ can strategically manipulate f by declaring $\prec_i$”.


The obvious example of an incentive compatible social choice function over two 
alternatives is taking the majority vote between them. The main point of this section 
is, however, that when the number of alternatives is larger than 2, only trivial social 
choice functions are incentive compatible. 
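
Definitions 9.4 and 9.5 can be checked by brute force on tiny instances. The sketch below is only an illustration under stated assumptions: three alternatives, three voters, rankings written best-first, and two hypothetical rules (`plurality` with alphabetical tie-breaking, and `dictatorship`) that do not appear in the text. It searches every profile and every unilateral change for a manipulation, and separately tests monotonicity, so the two answers can be compared as Proposition 9.6 predicts.

```python
from itertools import permutations, product

ALTERNATIVES = ("a", "b", "c")
RANKINGS = list(permutations(ALTERNATIVES))  # each ranking lists the best alternative first


def prefers(ranking, x, y):
    """True if a voter holding this ranking strictly prefers x to y."""
    return ranking.index(x) < ranking.index(y)


def plurality(profile):
    """Most first-place votes wins; ties are broken alphabetically (an assumption)."""
    tops = [r[0] for r in profile]
    return max(sorted(ALTERNATIVES), key=tops.count)


def dictatorship(profile):
    """Voter 0's top choice is always selected."""
    return profile[0][0]


def deviations(n):
    """Yield (profile, voter, changed profile) for every unilateral change of vote."""
    for profile in product(RANKINGS, repeat=n):
        for i in range(n):
            for other in RANKINGS:
                yield profile, i, profile[:i] + (other,) + profile[i + 1:]


def find_manipulation(f, n):
    """Return a witness (profile, voter, lying profile) for Definition 9.4, or None."""
    for profile, i, lied in deviations(n):
        if prefers(profile[i], f(lied), f(profile)):
            return profile, i, lied
    return None


def is_monotone(f, n):
    """Check Definition 9.5 over all profiles and all unilateral changes."""
    for profile, i, changed in deviations(n):
        a, a2 = f(profile), f(changed)
        if a != a2 and not (prefers(profile[i], a, a2) and prefers(changed[i], a2, a)):
            return False
    return True


for rule in (plurality, dictatorship):
    print(rule.__name__, find_manipulation(rule, 3) is None, is_monotone(rule, 3))
```

On these two rules the search reports that plurality is manipulable and not monotone, while the dictatorship is both incentive compatible and monotone.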


Definition 9.7 Voter i is a dictator in social choice function f if for all
$\prec_1, \ldots, \prec_n \in L$, $\forall b \neq a,\ a \succ_i b \Rightarrow f(\prec_1, \ldots, \prec_n) = a$. f is called a dictatorship
if some i is a dictator in it.


Theorem 9.8 (Gibbard-Satterthwaite) Let f be an incentive compatible social
choice function onto A, where $|A| \ge 3$, then f is a dictatorship.


Note the requirement that f is onto, as otherwise the bound on the size of A has 
no bite. To derive the theorem as a corollary of Arrow’s theorem, we will construct a 
social welfare function F from the social choice function f. The idea is that in order 
to decide whether $a \prec b$, we will “move” a and b to the top of all voters’ preferences,
and then see whether f chooses a or b. Formally, 


Definition 9.9


• Notation: Let $S \subset A$ and $\prec \in L$. Denote by $\prec^S$ the order obtained by moving
all alternatives in S to the top in $\prec$. Formally, for $a, b \in S$, $a \prec^S b \Leftrightarrow a \prec b$; for
$a, b \notin S$, also $a \prec^S b \Leftrightarrow a \prec b$; but for $a \notin S$ and $b \in S$, $a \prec^S b$.
• The social welfare function F that extends the social choice function f is defined
by $F(\prec_1, \ldots, \prec_n) = \prec$, where $a \prec b$ iff $f(\prec_1^{\{a,b\}}, \ldots, \prec_n^{\{a,b\}}) = b$.


We first have to show that F is indeed a social welfare function, i.e., that it is 
antisymmetric and transitive. 


Lemma 9.10 If f is an incentive compatible social choice function onto A then
the extension F is a social welfare function.


To conclude the proof of the theorem as a corollary of Arrow’s, it then suffices to 
show: 


Lemma 9.11 If f is an incentive compatible social choice function onto A,
which is not a dictatorship then the extension F satisfies unanimity and
independence of irrelevant alternatives and is not a dictatorship.


PROOF OF LEMMAS 9.10 AND 9.11 We start with a general claim which holds 
under the conditions on f: 


Claim: For any $\prec_1, \ldots, \prec_n$ and any S, $f(\prec_1^S, \ldots, \prec_n^S) \in S$.
Take some $a \in S$ and since f is onto, for some $\prec'_1, \ldots, \prec'_n$, $f(\prec'_1, \ldots, \prec'_n) = a$.
Now, sequentially, for $i = 1, \ldots, n$, change $\prec'_i$ to $\prec_i^S$. We claim that at no point
during this sequence of changes will f output any outcome $b \notin S$. At every stage
this is simply due to monotonicity since $b \prec_i^S a'$ for $a' \in S$ being the previous
outcome. This concludes the proof of the claim.

We can now prove all properties needed for the two lemmas: 


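• Antisymmetry is implied by the claim since $f(\prec_1^{\{a,b\}}, \ldots, \prec_n^{\{a,b\}}) \in \{a, b\}$.

• Transitivity: assume for contradiction that $a \prec b \prec c \prec a$ (where $\prec = F(\prec_1, \ldots, \prec_n)$).
Take $S = \{a, b, c\}$ and using the claim assume without loss of generality
that $f(\prec_1^S, \ldots, \prec_n^S) = a$. Sequentially changing $\prec_i^S$ to $\prec_i^{\{a,b\}}$ for each i,
monotonicity of f implies that also $f(\prec_1^{\{a,b\}}, \ldots, \prec_n^{\{a,b\}}) = a$, and thus $a \succ b$.

• Unanimity: If for all i, $b \prec_i a$, then $(\prec_i^{\{a,b\}})^{\{a\}} = \prec_i^{\{a,b\}}$ and thus by the claim
$f(\prec_1^{\{a,b\}}, \ldots, \prec_n^{\{a,b\}}) = a$.

• Independence of irrelevant alternatives: If for all i, $b \prec_i a \Leftrightarrow b \prec'_i a$, then
$f(\prec_1^{\{a,b\}}, \ldots, \prec_n^{\{a,b\}}) = f({\prec'_1}^{\{a,b\}}, \ldots, {\prec'_n}^{\{a,b\}})$ since when we, sequentially for all i, flip
$\prec_i^{\{a,b\}}$ into ${\prec'_i}^{\{a,b\}}$ the outcome does not change because of monotonicity and the
claim.

• Nondictatorship: obvious.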


The Gibbard-Satterthwaite theorem seems to quash any hope of designing incentive
compatible social choice functions. The whole field of Mechanism Design attempts
to escape from this impossibility result using various modifications in the model. The
next section describes how the addition of “money” offers an escape route. Chapter 10
offers other escape routes that do not rely on money.


9.3 Mechanisms with Money


In the previous section, we modeled a voter’s preference as an order on the alternatives.
$a \succ_i b$ implies that i prefers a to b, but we did not model “by how much” a is
preferred to b. “Money” is a yardstick that allows measuring this. Moreover, money
can be transferred between players. The existence of money with these properties is an
assumption, but a fairly reasonable one in many circumstances, and will allow us to do
things that we could not do otherwise.

Formally, in this section we redefine our setting. We will still have a set of alternatives
A and a set of n players I (which we will no longer call voters). The preference of
a player i is now given by a valuation function $v_i : A \to \mathbb{R}$, where $v_i(a)$ denotes the
“value” that i assigns to alternative a being chosen. This value is in terms of some
currency; i.e., we assume that if a is chosen and then player i is additionally given
some quantity m of money, then i’s utility is $u_i = v_i(a) + m$, this utility being the
abstraction of what the player desires and aims to maximize. Utilities of this form
are called quasilinear preferences, denoting the separable and linear dependence on
money.


9.3.1 Vickrey’s Second Price Auction 


Before we proceed to the general setting, in this subsection we study a basic example:
a simple auction. Consider a single item that is auctioned for sale among n players.
Each player i has a scalar value $w_i$ that he is “willing to pay” for this item. More
specifically, if he wins the item, but has to pay some price p for it, then his utility is
$w_i - p$, while if someone else wins the item then i’s utility is 0. Putting this scenario
into the terms of our general setting, the set of alternatives here is the set of possible
winners, $A = \{i\text{-wins} \mid i \in I\}$, and the valuation of each bidder i is $v_i(i\text{-wins}) = w_i$
and $v_i(j\text{-wins}) = 0$ for all $j \neq i$. A natural social choice would be to allocate the item
to the player who values it highest: choose i-wins, where $i = \arg\max_j w_j$. However,
the challenge is that we do not know the values $w_i$ but rather each player knows his
own value, and we want to make sure that our mechanism decides on the allocation
(the social choice) in a way that cannot be strategically manipulated. Our degree of
freedom is the definition of the payment by the winner.

Let us first consider the two most natural choices of payment and see why they do 
not work as intended: 


• No payment: In this version we give the item for free to the player with highest $w_i$.
Clearly, this method is easily manipulated: every player will benefit by exaggerating his
$w_i$, reporting a much larger $w'_i \gg w_i$ that can cause him to win the item, even though
his real $w_i$ is not the highest.

• Pay your bid: An attempt at correction is to have the winner pay the declared bid.
However, this system is also open to manipulation: a player with value $w_i$ who wins
and pays $w_i$ gets a total utility of 0. Thus it is clear that he should attempt declaring
a somewhat lower value $w'_i < w_i$ that still wins. In this case he can still win the item
getting a value of $w_i$ (his real value) but paying only the smaller $w'_i$ (his declared value),
obtaining a net positive utility $u_i = w_i - w'_i > 0$. What value $w'_i$ should i bid then?
Well, if i knows the value of the second highest bid, then he should declare just above
it. But what if he does not know?


Here is the solution. 


Definition 9.12 Vickrey’s second price auction: Let the winner be the player
i with the highest declared value of $w_i$, and let i pay the second highest declared
bid $p^* = \max_{j \neq i} w_j$.


Now it turns out that manipulation can never increase any player’s utility. Formally,


Proposition 9.13 (Vickrey) For every $w_1, \ldots, w_n$ and every $w'_i$, let $u_i$ be i’s
utility if he bids $w_i$ and $u'_i$ his utility if he bids $w'_i$. Then, $u_i \ge u'_i$.


PROOF Assume that by saying $w_i$ he wins, and that the second highest (reported)
value is $p^*$, then $u_i = w_i - p^* \ge 0$. Now, for an attempted manipulation $w'_i > p^*$,
i would still win if he bids $w'_i$ and would still pay $p^*$, thus $u'_i = u_i$. On the other
hand, for $w'_i \le p^*$, i would lose so $u'_i = 0 \le u_i$.

If i loses by bidding $w_i$, then $u_i = 0$. Let j be the winner in this case, and
thus $w_j \ge w_i$. For $w'_i < w_j$, i would still lose and so $u'_i = 0 = u_i$. For $w'_i \ge w_j$,
i would win, but would pay $w_j$, thus his utility would be $u'_i = w_i - w_j \le 0 = u_i$.


This very simple and elegant idea achieves something that is quite remarkable: 
it reliably computes a function (argmax) of n numbers (the $w_i$’s) that are each
held secretly by a different self-interested player! Taking a philosophical point of 
view, this may be seen as the mechanics for the implementation of Adam Smith’s 
invisible hand: despite private information and pure selfish behavior, social wel- 
fare is achieved. All the field of Mechanism Design is just a generalization of this 
possibility. 
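
As a small illustration of Definition 9.12 and Proposition 9.13, the following sketch implements the second price rule and checks, by exhaustive search over a small grid of integer bids, that no misreport ever beats bidding the true value. The function names, the grid, and the tie-breaking rule (lowest index wins a tie) are assumptions made for this example only.

```python
from itertools import product


def second_price_auction(bids):
    """Return (winner, price): the highest bid wins and pays the highest other bid.

    Ties go to the lowest index, a tie-breaking rule assumed for this sketch.
    """
    winner = max(range(len(bids)), key=lambda i: bids[i])
    price = max(b for j, b in enumerate(bids) if j != winner)
    return winner, price


def utility(i, true_value, bids):
    """Quasilinear utility of bidder i: value minus price if he wins, else 0."""
    winner, price = second_price_auction(bids)
    return true_value - price if winner == i else 0


def truthful_is_best_on_grid(grid, n):
    """Brute-force check of Proposition 9.13 for n bidders with bids drawn from grid."""
    for i in range(n):
        for others in product(grid, repeat=n - 1):
            for value, lie in product(grid, repeat=2):
                honest = others[:i] + (value,) + others[i:]
                lying = others[:i] + (lie,) + others[i:]
                if utility(i, value, lying) > utility(i, value, honest):
                    return False
    return True


print(second_price_auction([10, 6, 3]))       # bidder 0 wins and pays 6
print(truthful_is_best_on_grid(range(5), 3))  # searches every profile on the grid
```

The exhaustive check is of course no substitute for the proof; it only confirms the statement on the finite grid that was searched.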


9.3.2 Incentive Compatible Mechanisms 


In a world with money, our mechanisms will not only choose a social alternative but will
also determine monetary payments to be made by the different players. The complete 
social choice is then composed of the alternative chosen as well as of the transfer 
of money. Nevertheless, we will refer to each of these parts separately, calling the 
alternative chosen the social choice, not including in this term the monetary payments. 

Formally, a mechanism needs to socially choose some alternative from A, as well
as to decide on payments. The preference of each player i is modeled by a valuation
function $v_i : A \to \mathbb{R}$, where $v_i \in V_i$. Throughout the rest of this chapter, $V_i \subseteq \mathbb{R}^A$ is a
commonly known set of possible valuation functions for player i.

Starting at this point and for the rest of this chapter, it will be convenient to use the
following standard notation.


Notation Let $v = (v_1, \ldots, v_n)$ be an n-dimensional vector. We will denote
the (n − 1)-dimensional vector in which the i’th coordinate is removed by
$v_{-i} = (v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n)$. Thus we have three equivalent notations:
$v = (v_1, \ldots, v_n) = (v_i, v_{-i})$. Similarly, for $V = V_1 \times \cdots \times V_n$, we will denote
$V_{-i} = V_1 \times \cdots \times V_{i-1} \times V_{i+1} \times \cdots \times V_n$. Similarly we will use $t_{-i}$, $x_{-i}$, $X_{-i}$, etc.


Definition 9.14 A (direct revelation) mechanism is a social choice function
$f : V_1 \times \cdots \times V_n \to A$ and a vector of payment functions $p_1, \ldots, p_n$, where
$p_i : V_1 \times \cdots \times V_n \to \mathbb{R}$ is the amount that player i pays.


The qualification “direct revelation” will become clear in Section 9.4, where we will
generalize the notion of a mechanism further. We are now ready for the key definition
in this area: incentive compatibility, also called strategy-proofness or truthfulness.


Definition 9.15 A mechanism $(f, p_1, \ldots, p_n)$ is called incentive compatible if
for every player i, every $v_1 \in V_1, \ldots, v_n \in V_n$ and every $v'_i \in V_i$, if we denote
$a = f(v_i, v_{-i})$ and $a' = f(v'_i, v_{-i})$, then $v_i(a) - p_i(v_i, v_{-i}) \ge v_i(a') - p_i(v'_i, v_{-i})$.


Intuitively this means that player i whose valuation is $v_i$ would prefer “telling the
truth” $v_i$ to the mechanism rather than any possible “lie” $v'_i$, since this gives him higher
(in the weak sense) utility.


9.3.3 Vickrey–Clarke–Groves Mechanisms


While in the general setting without money, as we have seen, nothing nontrivial is
incentive compatible, the main result in this setting is positive and provides an incentive
compatible mechanism for the most natural social choice function: optimizing the social
welfare. The social welfare of an alternative $a \in A$ is the sum of the valuations of all
players for this alternative, $\sum_i v_i(a)$.


Definition 9.16 A mechanism $(f, p_1, \ldots, p_n)$ is called a Vickrey–Clarke–Groves
(VCG) mechanism if


• $f(v_1, \ldots, v_n) \in \arg\max_{a \in A} \sum_i v_i(a)$; that is, f maximizes the social welfare, and
• for some functions $h_1, \ldots, h_n$, where $h_i : V_{-i} \to \mathbb{R}$ (i.e., $h_i$ does not depend
on $v_i$), we have that for all $v_1 \in V_1, \ldots, v_n \in V_n$:
$p_i(v_1, \ldots, v_n) = h_i(v_{-i}) - \sum_{j \neq i} v_j(f(v_1, \ldots, v_n))$.


The main idea lies in the term $-\sum_{j \neq i} v_j(f(v_1, \ldots, v_n))$, which means that each
player is paid an amount equal to the sum of the values of all other players. When this
term is added to his own value $v_i(f(v_1, \ldots, v_n))$, the sum becomes exactly the total
social welfare of $f(v_1, \ldots, v_n)$. Thus this mechanism aligns all players’ incentives
with the social goal of maximizing social welfare, which is exactly achieved by telling
the truth. The other term in the payment, $h_i(v_{-i})$, has no strategic implications for player
i since it does not depend, in any way, on what he says, and thus from player i’s point
of view it is just a constant. Of course, the choice of $h_i$ does change significantly how
much money is paid and in which direction, but we will postpone this discussion. What
we have just intuitively explained is as follows.


Theorem 9.17 (Vickrey–Clarke–Groves) Every VCG mechanism is incentive
compatible.


Let us prove it formally. 


PROOF Fix i, $v_{-i}$, $v_i$, and $v'_i$. We need to show that for player i with valuation
$v_i$, the utility when declaring $v_i$ is not less than the utility when declaring $v'_i$.
Denote $a = f(v_i, v_{-i})$ and $a' = f(v'_i, v_{-i})$. The utility of i, when declaring $v_i$,
is $v_i(a) + \sum_{j \neq i} v_j(a) - h_i(v_{-i})$, but when declaring $v'_i$ is $v_i(a') + \sum_{j \neq i} v_j(a') - h_i(v_{-i})$.
But since $a = f(v_i, v_{-i})$ maximizes social welfare over all alternatives,
$v_i(a) + \sum_{j \neq i} v_j(a) \ge v_i(a') + \sum_{j \neq i} v_j(a')$ and thus the same inequality holds
when subtracting the same term $h_i(v_{-i})$ from both sides.
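
To make Definition 9.16 and Theorem 9.17 concrete, here is a minimal sketch of a VCG mechanism over a finite set of alternatives, together with a brute-force test of incentive compatibility on a tiny valuation space. Everything named here is an assumption of the example: valuations are dictionaries from alternatives to integers, `h(i, others)` stands for $h_i(v_{-i})$, ties in the argmax go to the first alternative listed, and the valuation space `TYPES` is an arbitrary small grid.

```python
from itertools import product

ALTS = ("x", "y")
# Every valuation assigning a value in {0, 1, 2} to each alternative.
TYPES = [dict(zip(ALTS, vals)) for vals in product(range(3), repeat=len(ALTS))]


def vcg(valuations, alternatives, h):
    """Return (chosen alternative, payments) for reported valuations.

    h(i, others) plays the role of h_i(v_{-i}); it never sees player i's own report.
    """
    def welfare(a, vals):
        return sum(v[a] for v in vals)

    chosen = max(alternatives, key=lambda a: welfare(a, valuations))
    payments = []
    for i in range(len(valuations)):
        others = valuations[:i] + valuations[i + 1:]
        payments.append(h(i, others) - welfare(chosen, others))
    return chosen, payments


def is_incentive_compatible(h, n=3):
    """Check Definition 9.15 for every profile in TYPES^n and every possible lie."""
    for profile in product(TYPES, repeat=n):
        profile = list(profile)
        chosen, pay = vcg(profile, ALTS, h)
        for i in range(n):
            honest = profile[i][chosen] - pay[i]
            for lie in TYPES:
                a2, pay2 = vcg(profile[:i] + [lie] + profile[i + 1:], ALTS, h)
                if profile[i][a2] - pay2[i] > honest:
                    return False
    return True


def h_zero(i, others):
    return 0


def h_arbitrary(i, others):
    # Any function of the other players' reports is allowed.
    return 7 + sum(v["x"] for v in others)


print(is_incentive_compatible(h_zero), is_incentive_compatible(h_arbitrary))
```

The two choices of `h` change how much money moves, but not the outcome of the check, which matches the observation that $h_i(v_{-i})$ is a constant from player i's point of view.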


9.3.4 Clarke Pivot Rule 


Let us now return to the question of choosing the “right” $h_i$’s. One possibility is
certainly choosing $h_i = 0$. This has the advantage of simplicity but usually does not
make sense since the mechanism pays here a great amount of money to the players. 
Intuitively we would prefer that players pay money to the mechanism, but not more 
than the gain that they get. Here are two conditions that seem to make sense, at least in 
a setting where all valuations are nonnegative. 


Definition 9.18
• A mechanism is (ex-post) individually rational if players always get nonnegative
utility. Formally if for every $v_1, \ldots, v_n$ we have that
$v_i(f(v_1, \ldots, v_n)) - p_i(v_1, \ldots, v_n) \ge 0$.
• A mechanism has no positive transfers if no player is ever paid money. Formally
if for every $v_1, \ldots, v_n$ and every i, $p_i(v_1, \ldots, v_n) \ge 0$.


The following choice of $h_i$’s provides these two properties.


Definition 9.19 (Clarke pivot rule) The choice $h_i(v_{-i}) = \max_{b \in A} \sum_{j \neq i} v_j(b)$
is called the Clarke pivot payment. Under this rule the payment of player i is
$p_i(v_1, \ldots, v_n) = \max_b \sum_{j \neq i} v_j(b) - \sum_{j \neq i} v_j(a)$, where $a = f(v_1, \ldots, v_n)$.


Intuitively, i pays an amount equal to the total damage that he causes the other 
players — the difference between the social welfare of the others with and without i’s 
participation. In other words, the payments make each player internalize the externali- 
ties that he causes. 


Lemma 9.20 A VCG mechanism with Clarke pivot payments makes no positive
transfers. If $v_i(a) \ge 0$ for every $v_i \in V_i$ and $a \in A$ then it is also individually
rational.


PROOF Let $a = f(v_1, \ldots, v_n)$ be the alternative maximizing $\sum_j v_j(a)$ and b
be the alternative maximizing $\sum_{j \neq i} v_j(b)$. To show individual rationality, the
utility of player i is $v_i(a) + \sum_{j \neq i} v_j(a) - \sum_{j \neq i} v_j(b) \ge \sum_j v_j(a) - \sum_j v_j(b) \ge 0$,
where the first inequality is since $v_i(b) \ge 0$ and the second is since a was chosen
as to maximize $\sum_j v_j(a)$. To show no positive transfers, note that
$p_i(v_1, \ldots, v_n) = \sum_{j \neq i} v_j(b) - \sum_{j \neq i} v_j(a) \ge 0$, since b was chosen as to maximize $\sum_{j \neq i} v_j(b)$.


As stated, the Clarke pivot rule does not fit many situations where valuations are
negative; i.e., when alternatives have costs to the players. Indeed, with the Clarke pivot
rule, players always pay money to the mechanism, while the natural interpretation in
case of costs would be the opposite. The spirit of the Clarke pivot rule in such cases
can be captured by a modified rule that chooses b so as to maximize the social welfare
“when i does not participate” where the exact meaning of this turns out to be quite
natural in most applications.


9.3.5 Examples 
9.3.5.1 Auction of a Single Item 


The Vickrey auction that we started our discussion with is a special case of a VCG
mechanism with the Clarke pivot rule. Here $A = \{i\text{-wins} \mid i \in I\}$. Each player has
value 0 if he does not get the item, and may have any nonnegative value if he does win the
item, thus $V_i = \{v_i \mid v_i(i\text{-wins}) \ge 0 \text{ and } \forall j \neq i,\ v_i(j\text{-wins}) = 0\}$. Notice that finding
the player with highest value is exactly equivalent to maximizing $\sum_j v_j(i\text{-wins})$ over the
alternatives i-wins, since only a single player gets nonzero value. VCG payments using
the Clarke pivot rule give exactly Vickrey’s second price auction.
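
The reduction to the second price auction can be traced numerically. The sketch below computes Clarke pivot payments (Definition 9.19) for arbitrary finite valuations and then feeds it the single-item valuations described above. The helper names and the encoding of the alternative "i-wins" as the integer i are assumptions of this example.

```python
def vcg_clarke(valuations, alternatives):
    """VCG with Clarke pivot payments; valuations[i][a] is player i's value for a."""
    def welfare(a, skip=None):
        # Total value for alternative a, optionally leaving out player `skip`.
        return sum(v[a] for j, v in enumerate(valuations) if j != skip)

    chosen = max(alternatives, key=welfare)
    payments = [
        max(welfare(b, skip=i) for b in alternatives) - welfare(chosen, skip=i)
        for i in range(len(valuations))
    ]
    return chosen, payments


def single_item_valuations(values):
    """Player i values the alternative 'i wins' at values[i] and all others at 0."""
    n = len(values)
    return [{w: (values[i] if w == i else 0) for w in range(n)} for i in range(n)]


values = [10, 6, 3]
vals = single_item_valuations(values)
chosen, payments = vcg_clarke(vals, range(len(values)))
print(chosen, payments)  # bidder 0 wins and pays the second highest value, 6

# Lemma 9.20 on this instance: no positive transfers, and individual rationality.
assert all(p >= 0 for p in payments)
assert all(vals[i][chosen] - payments[i] >= 0 for i in range(len(values)))
```

Only the winner is pivotal here: removing any losing bidder leaves the others' best attainable welfare unchanged, so the losers' payments are zero.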


9.3.5.2 Reverse Auction 


In a reverse auction (procurement auction) the auctioneer wants to procure an item
from the bidder with lowest cost. In this case the valuation spaces are given by
$V_i = \{v_i \mid v_i(i\text{-wins}) \le 0 \text{ and } \forall j \neq i,\ v_i(j\text{-wins}) = 0\}$, and indeed procuring the item
from the lowest cost bidder is equivalent to maximizing the social welfare. The natural
VCG payment rule would be for the mechanism to pay to the lowest bidder an amount
equal to the second lowest bid, and pay nothing to the others. This may be viewed as
capturing the spirit of the pivot rule since the second lowest bid is what would happen
“without i.”


9.3.5.3 Bilateral Trade 


In the bilateral trade problem a seller holds an item and values it at some $0 \le v_s \le 1$
and a potential buyer values it at some $0 \le v_b \le 1$. (The constants 0 and 1 are
arbitrary and may be replaced with any commonly known constants $v_l \le v_h$.)
The possible outcomes are A = {no-trade, trade} and social efficiency implies that
trade is chosen if $v_b > v_s$ and no-trade if $v_s > v_b$. Using VCG payments and
decreeing that no payments be made in case of no-trade, implies that in case of trade
the buyer pays $v_s$ and the seller is paid $v_b$. Notice that since in this case $v_b > v_s$,
the mechanism subsidizes the trade. As we will see below in Section 9.5.5, this is
unavoidable.


9.3.5.4 Multiunit Auctions 


In a multiunit auction, k identical units of some good are sold in an auction (where
k < n). In the simple case each bidder is interested in only a single unit. In this case
$A = \{S\text{-wins} \mid S \subset I,\ |S| = k\}$, and a bidder’s valuation $v_i$ gives some fixed value $v^*$
if i gets an item, i.e., $v_i(S) = v^*$ if $i \in S$ and $v_i(S) = 0$ otherwise. Maximizing social
welfare means allocating the items to the k highest bidders, and in the VCG mechanism
with the pivot rule, each of them should pay the (k + 1)’st highest offered price.
(Losers pay 0.)
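
The (k + 1)'st price rule can be compared against the Clarke pivot formula directly. In the following sketch, `multiunit_vcg` applies the rule stated above, while `clarke_payment` recomputes player i's payment from Definition 9.19 by enumerating all k-subsets of bidders. Both function names, and the convention that ties go to the lower index, are assumptions of this example.

```python
from itertools import combinations


def multiunit_vcg(bids, k):
    """Sell k identical units to unit-demand bidders (requires k < len(bids)).

    Returns (set of winners, payments); each winner pays the (k+1)-st highest bid.
    """
    order = sorted(range(len(bids)), key=lambda i: -bids[i])  # stable: ties to lower index
    winners = set(order[:k])
    price = bids[order[k]]
    return winners, [price if i in winners else 0 for i in range(len(bids))]


def clarke_payment(bids, k, i):
    """Player i's Clarke pivot payment, computed by brute force over all k-subsets."""
    n = len(bids)

    def others_value(subset):
        # Welfare of everyone except i when the bidders in `subset` each get a unit.
        return sum(bids[j] for j in subset if j != i)

    winners, _ = multiunit_vcg(bids, k)
    best_for_others = max(others_value(s) for s in combinations(range(n), k))
    return best_for_others - others_value(winners)


bids, k = [9, 7, 5, 2], 2
winners, payments = multiunit_vcg(bids, k)
print(winners, payments)  # bidders 0 and 1 win, each paying the third highest bid, 5
print([clarke_payment(bids, k, i) for i in range(len(bids))])
```

The brute-force enumeration is exponential in general and is used here only to cross-check the closed-form rule on a four-bidder instance.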

In a more general case, bidders may be interested in more than a single unit and have
a different value for each number of units obtained. The next level of sophistication 
comes when the items in the auction are heterogeneous, and valuations can give a 
different value to each combination of items. This is called a combinatorial auction 
and is studied at length in Chapter 11. 


9.3.5.5 Public Project 


The government is considering undertaking a public project (e.g., building a bridge).
The project has a commonly known cost C, and is valued by each citizen i at (a privately
known) value $v_i$. (We usually think that $v_i \ge 0$, but the case of allowing $v_i < 0$, i.e.,
citizens who are hurt by the project, is also covered.) Social efficiency means that
the government will undertake this project iff $\sum_i v_i > C$. (This is not technically a
subcase of our definition of maximizing the social welfare, since our definition did
not assume any costs or values for the designer, but becomes so by adding an extra
player “government” whose valuation space is the singleton valuation, giving cost C
to undertaking the project and 0 otherwise.) The VCG mechanism with the Clarke
pivot rule means that a player i with $v_i \ge 0$ will pay a nonzero amount only if he is
pivotal: $\sum_{j \neq i} v_j \le C$ but $\sum_j v_j > C$, in which case he will pay $p_i = C - \sum_{j \neq i} v_j$. (A
player with $v_i < 0$ will make a nonzero payment only if $\sum_{j \neq i} v_j > C$ but $\sum_j v_j \le C$,
in which case he will pay $p_i = \sum_{j \neq i} v_j - C$.) One may verify that $\sum_i p_i < C$ (unless
$\sum_i v_i = C$), and thus the payments collected do not cover the project’s costs. As we
will see in Section 9.5.5, this is unavoidable.
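
The pivotal-payment rule for the public project is short enough to state as code. This is a sketch under the conventions of the paragraph above (build iff the reported values sum to strictly more than C); the function name and the numbers in the examples are made up for illustration.

```python
def public_project(values, cost):
    """Return (build?, payments) under VCG with the Clarke pivot rule.

    The project is built iff the reported values sum to strictly more than cost.
    """
    total = sum(values)
    build = total > cost
    payments = []
    for v in values:
        others = total - v  # sum of the other citizens' reported values
        if build and others <= cost:
            payments.append(cost - others)   # pivotal supporter
        elif not build and others > cost:
            payments.append(others - cost)   # pivotal opponent (v < 0)
        else:
            payments.append(0)               # not pivotal
    return build, payments


build, payments = public_project([30, 40, 50], cost=100)
print(build, payments, sum(payments))  # built; payments 10, 20, 30 sum to 60 < 100

print(public_project([60, 60, -30], cost=100))  # not built; the opponent pays 20
```

In the first example every citizen is pivotal, yet the collected payments fall short of the cost, which illustrates the budget deficit noted in the text.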


9.3.5.6 Buying a Path in a Network 


Consider a communication network, modeled as a directed graph G = (V, E), where
each link $e \in E$ is owned by a different player, and has a cost $c_e \ge 0$ if his link is
used for carrying some message. Suppose that we wish to procure a communication
path between two specified vertices $s, t \in V$; i.e., the set of alternatives is the set of
all possible s-t paths in G, and player e has value 0 if the path chosen does not
contain e and value $-c_e$ if the path chosen does contain e. Maximizing social welfare
means finding the shortest path p (in terms of $\sum_{e \in p} c_e$). A VCG mechanism that
makes no payments to edges that are not in p, will pay to each $e_0 \in p$ the quantity
$\sum_{e \in p'} c_e - \sum_{e \in p - \{e_0\}} c_e$, where p is the shortest s-t path in G and p' is the shortest
s-t path in G that does not contain the edge $e_0$ (for simplicity, assume that G is 2-edge
connected so such a p' always exists). This corresponds to the spirit of the pivot rule
since “without $e_0$” the mechanism can simply not use paths that contain $e_0$.
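
The path payments can be computed with two kinds of shortest-path queries: one on the full graph and one per edge of the chosen path with that edge removed. The sketch below uses Dijkstra's algorithm from the standard library's `heapq`; the graph encoding (a dictionary from directed edges to costs), the function names, and the example network are assumptions of this illustration.

```python
import heapq


def shortest_path(edges, s, t, banned=None):
    """Dijkstra on {(u, v): cost}; returns (cost, list of edges), or (inf, None)."""
    adj = {}
    for (u, v), c in edges.items():
        if (u, v) != banned:
            adj.setdefault(u, []).append((v, c))
    dist, prev = {s: 0}, {}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, c in adj.get(u, []):
            if d + c < dist.get(v, float("inf")):
                dist[v], prev[v] = d + c, u
                heapq.heappush(heap, (d + c, v))
    if t not in dist:
        return float("inf"), None
    path, node = [], t
    while node != s:
        path.append((prev[node], node))
        node = prev[node]
    return dist[t], path[::-1]


def vcg_path_payments(edges, s, t):
    """Return (chosen path, {edge: payment made to that edge's owner})."""
    cost, path = shortest_path(edges, s, t)
    payments = {}
    for e in path:
        cost_without_e, _ = shortest_path(edges, s, t, banned=e)
        # Cost of p' minus the cost of the other edges on p.
        payments[e] = cost_without_e - (cost - edges[e])
    return path, payments


edges = {("s", "a"): 1, ("a", "t"): 2, ("s", "t"): 5}
print(vcg_path_payments(edges, "s", "t"))
```

In this example the chosen path costs 3, but the owners are paid 3 and 4 respectively, so the mechanism pays 7 in total, more than the cost of the direct edge.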


9.4 Implementation in Dominant Strategies 


In this section our aim is to put the issue of incentive compatibility in a wider context. 
The mechanisms considered so far extract information from the different players by 
motivating them to “tell the truth.” More generally, one may think of other, indirect,
methods of extracting sufficient information from the participants. Perhaps one may 
devise some complex protocol that achieves the required social choice when players 
act strategically. This section will formalize these more general mechanisms, and the 
associated notions describing what happens when “players act strategically.” 

Deviating from the common treatment in economics, in this section we will describe 
a model that does not involve any distributional assumptions. Many of the classical 
results of Mechanism Design are captured in this framework, including most of the ex- 
isting applications in computational settings. In Section 9.6 we will add this ingredient 
of distributional assumptions reaching the general “Bayesian” models. 


9.4.1 Games with Strict Incomplete Information 


How do we model strategic behavior of the players when they are missing some of 
the information that specifies the game? Specifically in our setting a player does not 
know the private information of the other players, information that determines their 
preferences. The standard setting in Game Theory supposes on the other hand that the 
“rules” of the game, including the utilities of all players, are public knowledge. 

We will use a model of games with independent private values and strict incomplete 
information. Let us explain the terms: “independent private values” means that the 
utility of a player depends fully on his private information and not on any information 
of others as it is independent from his own information. Strict incomplete information 
is a (not completely standard) term that means that we will have no probabilistic 
information in the model. An alternative term sometimes used is “pre-Bayesian.” From 
a CS perspective, it means that we will use a worst case analysis over unknown 
information. So here is the model. 


Definition 9.21 A game with (independent private values and) strict incomplete
information for a set of n players is given by the following ingredients:
(i) For every player i, a set of actions $X_i$.

(ii) For every player i, a set of types $T_i$. A value $t_i \in T_i$ is the private information
that i has.

(iii) For every player i, a utility function $u_i : T_i \times X_1 \times \cdots \times X_n \to \mathbb{R}$, where
$u_i(t_i, x_1, \ldots, x_n)$ is the utility achieved by player i, if his type (private
information) is $t_i$, and the profile of actions taken by all players is $x_1, \ldots, x_n$.


The main idea that we wish to capture with this definition is that each player i must
choose his action $x_i$ when knowing $t_i$ but not the other $t_j$’s. Note that the $t_j$’s do not
affect his utility, but they do affect how the other players behave. Thus the interplay
between the different $x_i$’s is more delicate than in “regular” games. The total behavior
of player i in such a setting is captured by a function that specifies which action $x_i$ is
taken for every possible type $t_i$; this is termed a strategy. It is these strategies that we
want to be in equilibrium.


Definition 9.22

• A strategy of a player i is a function $s_i : T_i \to X_i$.

• A profile of strategies $s_1, \ldots, s_n$ is an ex-post-Nash equilibrium if for every
$t_1, \ldots, t_n$ we have that the actions $s_1(t_1), \ldots, s_n(t_n)$ are in Nash equilibrium in
the full information game defined by the $t_i$’s. Formally: For all i, all $t_1, \ldots, t_n$, and
all $x'_i$ we have that $u_i(t_i, s_i(t_i), s_{-i}(t_{-i})) \ge u_i(t_i, x'_i, s_{-i}(t_{-i}))$.

• A strategy $s_i$ is a (weakly) dominant strategy if for every $t_i$ we have that the action
$s_i(t_i)$ is a dominant strategy in the full information game defined by $t_i$. Formally:
for all $t_i$, all $x_{-i}$ and all $x'_i$ we have that $u_i(t_i, s_i(t_i), x_{-i}) \ge u_i(t_i, x'_i, x_{-i})$. A profile
$s_1, \ldots, s_n$ is called a dominant strategy equilibrium if each $s_i$ is a dominant strategy.


Thus the notion of ex-post Nash requires that $s_i(t_i)$ is a best response to $s_{-i}(t_{-i})$
for every possible value of $t_{-i}$, i.e., without knowing anything about $t_{-i}$ but rather
only knowing the forms of the other players’ strategies $s_{-i}$ as functions. The notion
of dominant strategy requires that $s_i(t_i)$ is a best response to any $x_{-i}$ possible, i.e.,
without knowing anything about $t_{-i}$ or about $s_{-i}$. Both of these definitions seem too
good to be true: how likely is it that a player has a single action that is a best response
to all $x_{-i}$ or even to all $s_{-i}(t_{-i})$? Indeed in usual cases one does not expect games with
strict incomplete information to have any of these equilibria. However, in the context 
of Mechanism Design — where we get to design the game — we can sometimes make 
sure that they do exist. 

While at first sight the notion of dominant strategy equilibrium seems much stronger 
than ex-post Nash, this is only due to actions that are never used. 


Proposition 9.23 Let $s_1, \ldots, s_n$ be an ex-post-Nash equilibrium of a game
$(X_1, \ldots, X_n; T_1, \ldots, T_n; u_1, \ldots, u_n)$. Define $X'_i = \{s_i(t_i) \mid t_i \in T_i\}$ (i.e., $X'_i$ is the
actual range of $s_i$ in $X_i$), then $s_1, \ldots, s_n$ is a dominant strategy equilibrium in the
game $(X'_1, \ldots, X'_n; T_1, \ldots, T_n; u_1, \ldots, u_n)$.


PROOF Let $x_i = s_i(t_i) \in X'_i$, $x'_i \in X'_i$, and for every $j \neq i$, $x_j \in X'_j$. By definition
of $X'_j$, for every $j \neq i$, there exists $t_j \in T_j$ such that $s_j(t_j) = x_j$. Since $s_1, \ldots, s_n$
is an ex-post-Nash equilibrium, $u_i(t_i, s_i(t_i), s_{-i}(t_{-i})) \ge u_i(t_i, x'_i, s_{-i}(t_{-i}))$, and as
$x_{-i} = s_{-i}(t_{-i})$ we get exactly $u_i(t_i, s_i(t_i), x_{-i}) \ge u_i(t_i, x'_i, x_{-i})$ as required in the
definition of dominant strategies.


9.4.2 Mechanisms 


We are now ready to formalize the notion of a general (nondirect revelation)
mechanism. The idea is that each player has some private information $t_i \in T_i$ that captures his
preference over a set of alternatives A; i.e., $v_i(t_i, a)$ is the value that player i assigns to a
when his private information is $t_i$. We wish to “implement” some social choice function
$f : T_1 \times \cdots \times T_n \to A$ that aggregates these preferences. We design a “mechanism”
for this purpose: this will be some protocol for interaction with the players, specifying
what each can “say” and what is done in each case. Formally, we can specify a set
of possible actions $X_i$ for each player, an outcome function $a : X_1 \times \cdots \times X_n \to A$
that chooses an alternative in A for each profile of actions, and payment functions
$p_i : X_1 \times \cdots \times X_n \to \mathbb{R}$ that specify the payment of each player for every profile of
actions. Now the players are put in a game with strict incomplete information and we
may expect them to reach an equilibrium point (if such exists).


Definition 9.24

• A mechanism for n players is given by (a) players’ type spaces $T_1, \ldots, T_n$, (b)
players’ action spaces $X_1, \ldots, X_n$, (c) an alternative set A, (d) players’ valuation
functions $v_i : T_i \times A \to \mathbb{R}$, (e) an outcome function $a : X_1 \times \cdots \times X_n \to A$,
and (f) payment functions $p_1, \ldots, p_n$, where $p_i : X_1 \times \cdots \times X_n \to \mathbb{R}$. The game
with strict incomplete information induced by the mechanism is given by using
the type spaces $T_i$, the action spaces $X_i$, and the utilities
$u_i(t_i, x_1, \ldots, x_n) = v_i(t_i, a(x_1, \ldots, x_n)) - p_i(x_1, \ldots, x_n)$.

• The mechanism implements a social choice function $f : T_1 \times \cdots \times T_n \to A$ in
dominant strategies if for some dominant strategy equilibrium $s_1, \ldots, s_n$ of the
induced game, where $s_i : T_i \to X_i$, we have that for all $t_1, \ldots, t_n$,
$f(t_1, \ldots, t_n) = a(s_1(t_1), \ldots, s_n(t_n))$.

• Similarly we say that the mechanism implements f in ex-post-equilibrium if for
some ex-post equilibrium $s_1, \ldots, s_n$ of the induced game we have that for all
$t_1, \ldots, t_n$, $f(t_1, \ldots, t_n) = a(s_1(t_1), \ldots, s_n(t_n))$.


Clearly every dominant strategy implementation is also an ex-post-Nash
implementation. Note that our definition only requires that for some equilibrium
$f(t_1, \ldots, t_n) = a(s_1(t_1), \ldots, s_n(t_n))$ and allows other equilibria to exist. A stronger requirement would
be that all equilibria have this property, or stronger still, that only a unique equilibrium
point exists.


9.4.3 The Revelation Principle 


At first sight it seems that the more general definition of mechanisms will allow us 
to do more than is possible using incentive compatible direct revelation mechanisms 
introduced in Section 9.3. This turns out to be false: any general mechanism that imple- 
ments a function in dominant strategies can be converted into an incentive compatible 
one. 


Proposition 9.25 (Revelation principle) If there exists an arbitrary mechanism
that implements f in dominant strategies, then there exists an incentive
compatible mechanism that implements f. The payments of the players in the
incentive compatible mechanism are identical to those, obtained at equilibrium,
of the original mechanism.


PROOF The proof is very simple: the new mechanism will simply simulate
the equilibrium strategies of the players. That is, let $s_1, \ldots, s_n$ be a dominant
strategy equilibrium of the original mechanism; we define a new direct
revelation mechanism: $f(t_1, \ldots, t_n) = a(s_1(t_1), \ldots, s_n(t_n))$ and
$p'_i(t_1, \ldots, t_n) = p_i(s_1(t_1), \ldots, s_n(t_n))$. Now, since each $s_i$ is a dominant strategy for player i,
then for every $t_i, x_{-i}, x'_i$ we have that
$v_i(t_i, a(s_i(t_i), x_{-i})) - p_i(s_i(t_i), x_{-i}) \ge v_i(t_i, a(x'_i, x_{-i})) - p_i(x'_i, x_{-i})$.
Thus in particular this is true for all $x_{-i} = s_{-i}(t_{-i})$
and any $x'_i = s_i(t'_i)$, which gives the definition of incentive compatibility of the
mechanism $(f, p'_1, \ldots, p'_n)$.
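
The construction in this proof is a simple wrapper, which the following sketch illustrates. The indirect mechanism used here is hypothetical and chosen only so that the dominant strategy is not truth-telling: the highest action wins a single item and the winner pays half of the highest competing action, so announcing twice one's value is a dominant strategy. `direct_revelation` then builds the mechanism of the proof by applying the strategies on the players' behalf, and a brute-force check over a small grid of types tests Definition 9.15 for the result.

```python
from itertools import product


def indirect_outcome(actions):
    """Hypothetical indirect mechanism: the highest action wins (lowest index on ties)."""
    return max(range(len(actions)), key=lambda i: actions[i])


def indirect_payments(actions):
    """The winner pays half of the highest competing action; everyone else pays 0."""
    w = indirect_outcome(actions)
    price = max(x for j, x in enumerate(actions) if j != w) / 2
    return [price if i == w else 0 for i in range(len(actions))]


def equilibrium_strategy(t):
    """Dominant strategy in this indirect mechanism: announce twice the true value."""
    return 2 * t


def direct_revelation(outcome, payments, strategies):
    """Build (f, p') from the proof: simulate each player's strategy on his reported type."""
    def f(types):
        return outcome([s(t) for s, t in zip(strategies, types)])

    def p(types):
        return payments([s(t) for s, t in zip(strategies, types)])

    return f, p


def is_incentive_compatible(f, p, grid, n):
    """Check Definition 9.15 on a grid, with v_i(t_i, a) = t_i if a == i else 0."""
    def utility(i, t_i, reports):
        return (t_i if f(reports) == i else 0) - p(reports)[i]

    for types in product(grid, repeat=n):
        for i, lie in product(range(n), grid):
            lied = list(types[:i]) + [lie] + list(types[i + 1:])
            if utility(i, types[i], lied) > utility(i, types[i], list(types)):
                return False
    return True


f, p = direct_revelation(indirect_outcome, indirect_payments, [equilibrium_strategy] * 3)
print(f([10, 6, 3]), p([10, 6, 3]))
print(is_incentive_compatible(f, p, range(5), 3))
```

On reported types (10, 6, 3) the wrapped mechanism gives the item to player 0 at price 6, the same outcome and payment that the indirect mechanism produces at its equilibrium actions (20, 12, 6).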


Corollary 9.26 If there exists an arbitrary mechanism that ex-post-Nash imple- 
ments f, then there exists an incentive compatible mechanism that implements 
f. Moreover, the payments of the players in the incentive compatible mechanism 
are identical to those, obtained in equilibrium, of the original mechanism. 


PROOF We take the ex-post implementation and restrict the action space of 
each player, as in Proposition 9.23, to those that are taken, for some input type, 
in the ex-post equilibrium $s_1, \dots, s_n$. Proposition 9.23 states that now $s_1, \dots, s_n$
is a dominant strategy equilibrium of the game with the restricted spaces, and 
thus the mechanism with the restricted action spaces is an implementation in 
dominant strategies. We can now invoke the revelation principle to get an incentive 
compatible mechanism. 


The revelation principle does not mean that indirect mechanisms are useless. In 
particular, general mechanisms may be adaptive (multiround), significantly reducing 
the communication (or computation) burden of the players or of the auctioneer relative 
to a nonadaptive direct mechanism. An example is the case of combinatorial auctions 
studied in Chapter 11. 


9.5 Characterizations of Incentive Compatible Mechanisms 


In Section 9.3 we saw how to implement the most natural social choice function: maxi- 
mization of the social welfare. The question that drives this section is: What other social 
choice functions can we implement? In economic settings, the main reasons for at- 
tempting implementations of other social choice functions are increasing the revenue or 
introducing some kind of fairness. In computerized settings there are many natural opti- 
mization goals and we would like to be able to implement each of them. For example, in 
scheduling applications, a common optimization goal is that of the “makespan” — com- 
pletion time of the last job. This is certainly a social choice function that is very different 
than maximizing the total social welfare — how can it be implemented? Another major 
motivation for social choice functions that do not maximize social welfare comes from 
computational considerations. In many applications the set of alternatives A is com- 
plex, and maximizing social welfare is a hard computational problem (NP-complete). 
In many of these cases there are computationally efficient algorithms that approximate 
the maximum social welfare. Such an algorithm in effect gives a social choice function
that approximates social welfare maximization, but is different from it. Can it be
implemented?

Chapter 12 and parts of Chapter 11 address these issues specifically. This section 
limits itself to laying the foundations by providing basic characterizations of imple- 
mentable social choice functions and their associated payments. 

Because of the revelation principle, we can again restrict ourselves to incentive
compatible mechanisms. Thus, in this section we revert to the notation used
in Subsection 9.3.3: A mechanism $M = (f, p_1, \dots, p_n)$ over a domain of preferences
$V_1 \times \cdots \times V_n$ ($V_i \subseteq \mathbb{R}^A$) is composed of a social choice function
$f : V_1 \times \cdots \times V_n \to A$ and payment functions $p_1, \dots, p_n$, where
$p_i : V_1 \times \cdots \times V_n \to \mathbb{R}$ is the amount that player $i$ pays. In the rest of
the section we will provide characterizations of when such mechanisms are incentive compatible.


9.5.1 Direct Characterization 


We start by stating explicitly the required properties from an incentive compatible 
mechanism. 


Proposition 9.27 A mechanism is incentive compatible if and only if it satisfies
the following conditions for every $i$ and every $v_{-i}$:

(i) The payment $p_i$ does not depend on $v_i$, but only on the alternative chosen
$f(v_i, v_{-i})$. That is, for every $v_{-i}$, there exist prices $p_a \in \mathbb{R}$, for every $a \in A$,
such that for all $v_i$ with $f(v_i, v_{-i}) = a$ we have that $p(v_i, v_{-i}) = p_a$.

(ii) The mechanism optimizes for each player. That is, for every $v_i$, we have that
$f(v_i, v_{-i}) \in \arg\max_a (v_i(a) - p_a)$, where the quantification is over all
alternatives in the range of $f(\cdot, v_{-i})$.


PROOF (If part) Denote $a = f(v_i, v_{-i})$, $a' = f(v'_i, v_{-i})$, $p_a = p(v_i, v_{-i})$, and
$p_{a'} = p(v'_i, v_{-i})$. The utility of $i$, when telling the truth, is $v_i(a) - p_a$, which
is not less than the utility when declaring $v'_i$, $v_i(a') - p_{a'}$, since the mechanism
optimizes for $i$, i.e., $a = f(v_i, v_{-i}) \in \arg\max_a (v_i(a) - p_a)$.

(Only-if part; first condition) If for some $v_i, v'_i$, $f(v_i, v_{-i}) = f(v'_i, v_{-i})$ but
$p_i(v_i, v_{-i}) > p_i(v'_i, v_{-i})$ then a player with type $v_i$ will increase his utility by
declaring $v'_i$.

(Only-if part; second condition) If $f(v_i, v_{-i}) \notin \arg\max_a (v_i(a) - p_a)$, fix
$a' \in \arg\max_a (v_i(a) - p_a)$ in the range of $f(\cdot, v_{-i})$, and thus for some $v'_i$,
$a' = f(v'_i, v_{-i})$. Now a player with type $v_i$ will increase his utility by
declaring $v'_i$.


9.5.2 Weak Monotonicity 


The previous characterization involves both the social choice function and the payment 
functions. We now provide a partial characterization that only involves the social choice 
function. In Section 9.5.5 we will see that the social choice function usually determines 
the payments essentially uniquely.


Definition 9.28 A social choice function $f$ satisfies Weak Monotonicity
(WMON) if for all $i$, all $v_{-i}$ we have that $f(v_i, v_{-i}) = a \ne b = f(v'_i, v_{-i})$ implies
that $v_i(a) - v_i(b) \ge v'_i(a) - v'_i(b)$.


That is, WMON means that if the social choice changes when a single player changes 
his valuation, then it must be because the player increased his value of the new choice 
relative to his value of the old choice. 
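
Because WMON is a local condition, on small finite domains it can be checked by brute force. The following Python sketch does this; the representation of valuations as dictionaries, the function name `satisfies_wmon`, and the two example rules are assumptions made for illustration and do not come from the text.

```python
from itertools import product

def satisfies_wmon(f, domains):
    """Check WMON for f on finite domains.
    domains[i] is a list of valuations for player i; each valuation is a
    dict mapping alternatives to values. f maps a tuple of valuations
    (one per player) to an alternative."""
    n = len(domains)
    for i in range(n):
        others = [domains[j] for j in range(n) if j != i]
        for v_others in product(*others):
            for vi, vi2 in product(domains[i], repeat=2):
                prof = tuple(v_others[:i]) + (vi,) + tuple(v_others[i:])
                prof2 = tuple(v_others[:i]) + (vi2,) + tuple(v_others[i:])
                a, b = f(prof), f(prof2)
                # WMON: a != b must imply v_i(a) - v_i(b) >= v'_i(a) - v'_i(b).
                if a != b and vi[a] - vi[b] < vi2[a] - vi2[b]:
                    return False
    return True

# Hypothetical example: two players, two alternatives, values in {0, 1, 2}.
alts = ["a", "b"]
vals = [{"a": x, "b": y} for x in range(3) for y in range(3)]
domains = [vals, vals]

def welfare_max(prof):
    return max(alts, key=lambda a: sum(v[a] for v in prof))

def welfare_min(prof):
    return min(alts, key=lambda a: sum(v[a] for v in prof))
```

Under these assumptions, welfare maximization passes the check while welfare minimization fails it: with the other player indifferent, raising one's relative value for an alternative makes the minimizer move away from it.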


Theorem 9.29 If a mechanism $(f, p_1, \dots, p_n)$ is incentive compatible, then $f$
satisfies WMON. If all domains of preferences $V_i$ are convex sets (as subsets of
a Euclidean space) then for every social choice function that satisfies WMON
there exist payment functions $p_1, \dots, p_n$ such that $(f, p_1, \dots, p_n)$ is incentive
compatible.


The first part of the theorem is easy and we prove it in full; the second part
is quite involved and will not be given here. It is known that WMON is not a sufficient
condition for incentive compatibility in general nonconvex (more precisely, nonsimply
connected) domains.


PROOF (First part) Assume first that $(f, p_1, \dots, p_n)$ is incentive compatible,
and fix $i$ and $v_{-i}$ in an arbitrary manner. Proposition 9.27 implies the existence
of fixed prices $p_a$ for all $a \in A$ (that do not depend on $v_i$) such that whenever the
outcome is $a$ then bidder $i$ pays exactly $p_a$. Now assume $f(v_i, v_{-i}) = a \ne b = f(v'_i, v_{-i})$.
Since a player with valuation $v_i$ does not prefer declaring $v'_i$ we have
that $v_i(a) - p_a \ge v_i(b) - p_b$. Similarly since a player with valuation $v'_i$ does not
prefer declaring $v_i$ we have that $v'_i(a) - p_a \le v'_i(b) - p_b$. Subtracting the second
inequality from the first, we get $v_i(a) - v_i(b) \ge v'_i(a) - v'_i(b)$, as required.


While WMON gives a pretty tight characterization of implementable social choice 
functions, it still leaves something to be desired as it is not intuitively clear what exactly 
the WMON functions are. The problem is that the WMON condition is a local condition 
for each player separately and for each v_; separately. Is there a global characterization? 
This turns out to depend intimately on the domains of preferences $V_i$. For two extreme
cases there are good global characterizations: when $V_i$ is “unrestricted,” i.e., $V_i = \mathbb{R}^A$,
and when $V_i$ is so severely restricted as to be essentially single dimensional. These two
cases are treated in the next two subsections. The intermediate range where
the $V_i$’s are somewhat restricted, a range in which most computationally interesting
problems lie, is still wide open. More on this appears in Chapter 12.


9.5.3 Weighted VCG 


It turns out that when the domain of preferences is unrestricted, then the only incentive 
compatible mechanisms are simple variations of the VCG mechanism. These variations 
allow giving weights to the players, weights to the alternatives, and allow restricting 
the range. The resulting social choice function is an “affine maximizer”:


Definition 9.30 A social choice function $f$ is called an affine maximizer if
for some subrange $A' \subset A$, for some player weights $w_1, \dots, w_n \in \mathbb{R}^+$ and for
some outcome weights $c_a \in \mathbb{R}$ for every $a \in A'$, we have that
$f(v_1, \dots, v_n) \in \arg\max_{a \in A'} (c_a + \sum_i w_i v_i(a))$.


It is easy to see that VCG mechanisms can be generalized to affine maximizers: 


Proposition 9.31 Let $f$ be an affine maximizer. Define for every $i$,
$p_i(v_1, \dots, v_n) = h_i(v_{-i}) - \sum_{j \ne i} (w_j / w_i) v_j(a) - c_a / w_i$, where
$a = f(v_1, \dots, v_n)$ and $h_i$ is an arbitrary function that does not depend on $v_i$.
Then, $(f, p_1, \dots, p_n)$ is incentive compatible.


PROOF First, we can assume wlog that $h_i = 0$. The utility of player $i$ if
alternative $a$ is chosen is $v_i(a) + \sum_{j \ne i} (w_j / w_i) v_j(a) + c_a / w_i$. By multiplying by
$w_i > 0$, this expression is maximized when $c_a + \sum_j w_j v_j(a)$ is maximized, which
is what happens when $i$ reports $v_i$ truthfully.
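
As a sanity check of this proposition, the sketch below implements an affine maximizer with the payments above (taking $h_i = 0$) and verifies truthfulness by enumeration on a small grid. The weights, the outcome weights `c`, the three-alternative domain, and all function names are hypothetical choices for illustration; exact rational arithmetic is used so that ties are compared without rounding error.

```python
from fractions import Fraction
from itertools import product

def affine_choice(profile, weights, c):
    """argmax over a in A' (the keys of c) of c[a] + sum_i w_i * v_i(a)."""
    return max(c, key=lambda a: c[a] + sum(w * v[a] for w, v in zip(weights, profile)))

def affine_payment(i, profile, weights, c):
    """Payment of player i from Proposition 9.31, with h_i = 0 (our choice)."""
    a = affine_choice(profile, weights, c)
    others = sum(weights[j] * profile[j][a] for j in range(len(profile)) if j != i)
    return -(others + c[a]) / weights[i]

def is_truthful(domain, weights, c):
    """Enumerate all profiles and all unilateral misreports on a finite domain."""
    n = len(weights)
    for profile in product(domain, repeat=n):
        for i in range(n):
            a = affine_choice(profile, weights, c)
            truth = profile[i][a] - affine_payment(i, profile, weights, c)
            for lie in domain:
                reported = profile[:i] + (lie,) + profile[i + 1:]
                b = affine_choice(reported, weights, c)
                if profile[i][b] - affine_payment(i, reported, weights, c) > truth:
                    return False
    return True

# Hypothetical instance: two players with weights 1 and 3, three alternatives.
weights = [Fraction(1), Fraction(3)]
c = {"a": Fraction(0), "b": Fraction(1), "c": Fraction(-2)}
domain = [dict(zip("abc", vals)) for vals in product(range(3), repeat=3)]
```

The check enumerates 729 profiles and every single-player deviation, so it is a finite test on this instance, not a proof.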


Roberts’ theorem states that for unrestricted domains with at least 3 possible out- 
comes, these are the only incentive compatible mechanisms. 


Theorem 9.32 (Roberts) If $|A| \ge 3$, $f$ is onto $A$, $V_i = \mathbb{R}^A$ for every $i$, and
$(f, p_1, \dots, p_n)$ is incentive compatible then $f$ is an affine maximizer.


The proof of this theorem is not trivial and is given in Chapter 12. It is easy to see
that the restriction $|A| \ge 3$ is crucial (as in Arrow’s theorem), since the case $|A| = 2$
falls into the category of “single parameter” domains discussed below, for which there
do exist incentive compatible mechanisms beyond weighted VCG. It remains open to
what extent the restriction $V_i = \mathbb{R}^A$ can be relaxed.


9.5.4 Single-Parameter Domains 


The unrestricted case $V_i = \mathbb{R}^A$ basically means that the valuation space has full
dimensionality. The opposite case is when the space $V_i$ is single-dimensional; i.e., there is
a single real parameter that directly determines the whole vector $v_i$. There are several
possible levels of generality in which to formalize this, and we will consider one of
intermediate generality that is simple and yet suffices for most applications. In our
setting each bidder has a private scalar value for “winning,” with “losing” having value of
0. This is modeled by some commonly known subset of winning alternatives $W_i \subseteq A$.
The main point is that all winning alternatives are equivalent to each other for player
$i$, and similarly all losing outcomes are equivalent to each other. All the examples in
Section 9.3.5 fall into this category. A simple example is an auction of one item where
$W_i$ is the single outcome where $i$ wins. A more complex example is the setting of
buying a path in a network (Subsection 9.3.5.6), where $W_i$ is the set of all paths that
contain edge $i$.


Definition 9.33 A single parameter domain $V_i$ is defined by a (publicly known)
$W_i \subset A$ and a range of values $[t^0, t^1]$. $V_i$ is the set of $v_i$ such that for some
$t^0 \le t \le t^1$, $v_i(a) = t$ for all $a \in W_i$ and $v_i(a) = 0$ for all $a \notin W_i$. In such settings
we will abuse notation and use $v_i$ as the scalar $t$.


For this setting it is quite easy to completely characterize incentive compatible 
mechanisms. 


Definition 9.34 A social choice function $f$ on a single parameter domain is
called monotone in $v_i$ if for every $v_{-i}$ and every $v_i \le v'_i \in \mathbb{R}$ we have that
$f(v_i, v_{-i}) \in W_i$ implies that $f(v'_i, v_{-i}) \in W_i$. That is, if valuation $v_i$ makes $i$
win, then so will every higher valuation $v'_i \ge v_i$.


For a monotone function f, for every v_; for which player i can both win and lose, 
there is always a critical value below which i loses and above which he wins. For 
example, in a second price auction the critical value for each player is the highest
declared value among the other players.


Definition 9.35 The critical value of a monotone social choice function $f$ on a
single parameter domain is $c_i(v_{-i}) = \sup_{v_i : f(v_i, v_{-i}) \notin W_i} v_i$. The critical value at $v_{-i}$
is undefined if $\{v_i \mid f(v_i, v_{-i}) \notin W_i\}$ is empty.


We will call a mechanism on a single parameter domain “normalized” if the payment
for losing is always 0, i.e., for every $v_i, v_{-i}$ such that $f(v_i, v_{-i}) \notin W_i$ we have that
$p_i(v_i, v_{-i}) = 0$. It is not difficult to see that every incentive compatible mechanism
may be easily turned into a normalized one, so it suffices to characterize normalized
mechanisms.


Theorem 9.36 A normalized mechanism $(f, p_1, \dots, p_n)$ on a single parameter
domain is incentive compatible if and only if the following conditions hold:

(i) $f$ is monotone in every $v_i$.

(ii) Every winning bid pays the critical value. (Recall that losing bids pay 0.)
Formally, for every $i$, $v_i$, $v_{-i}$ such that $f(v_i, v_{-i}) \in W_i$, we have that
$p_i(v_i, v_{-i}) = c_i(v_{-i})$. (If $c_i(v_{-i})$ is undefined we require instead that for every $v_{-i}$,
there exists some value $c_i$, such that $p_i(v_i, v_{-i}) = c_i$ for all $v_i$ such that $f(v_i, v_{-i}) \in W_i$.)


PROOF (If part) Fix $i$, $v_{-i}$, $v_i$. For every declaration made by $i$, if he wins his
utility is $v_i - c_i(v_{-i})$ and if he loses his utility is 0. Thus he prefers winning if
$v_i > c_i(v_{-i})$ and losing if $v_i < c_i(v_{-i})$, which is exactly what happens when he
declares the truth.

(Only-if part, first condition) If $f$ is not monotone then for some $v'_i > v_i$
we have that $f(v'_i, v_{-i})$ loses while $f(v_i, v_{-i})$ wins and pays some amount
$p = p_i(v_i, v_{-i})$. Since a bidder with value $v_i$ is not better off bidding $v'_i$ and losing we
have that $v_i - p \ge 0$. Since a bidder with value $v'_i$ is not better off bidding $v_i$ and
winning we have that $v'_i - p \le 0$. Contradiction.

(Only-if part, second condition) Assume that some winning $v_i$ pays $p > c_i(v_{-i})$;
then, using Proposition 9.27, all winning bids will make the same payment,
including a winning $v'_i$ with $c_i(v_{-i}) < v'_i < p$. But such a bidder is better off losing,
which he can do by bidding some value $v^{lose} < c_i(v_{-i})$. In the other direction, if
$v_i$ pays $p < c_i(v_{-i})$ then a losing $v'_i$ with $c_i(v_{-i}) > v'_i > p$ is better off winning and
paying $p$, which will happen if he bids $v_i$.
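
The theorem suggests a generic recipe: given any monotone allocation rule, charge each winner his critical value. The Python sketch below finds the critical value by bisection. It is an illustration under assumptions: the rule `top_k_wins`, the bid range `[lo, hi]`, the tolerance, and the choice to charge 0 when the critical value is undefined are all ours, not the text's.

```python
def with_bid(bids, i, b):
    """Copy of the bid list with player i's bid replaced by b."""
    return bids[:i] + [b] + bids[i + 1:]

def critical_value(wins, i, bids, lo, hi, tol=1e-9):
    """Approximate c_i(v_-i) for a monotone rule wins(i, bids) -> bool.
    Returns None when i wins even with the lowest bid (critical value
    undefined) and hi when i loses with every bid in [lo, hi]."""
    if wins(i, with_bid(bids, i, lo)):
        return None
    if not wins(i, with_bid(bids, i, hi)):
        return hi
    while hi - lo > tol:  # invariant: lo loses, hi wins
        mid = (lo + hi) / 2
        if wins(i, with_bid(bids, i, mid)):
            hi = mid
        else:
            lo = mid
    return lo

def payments(wins, bids, lo, hi):
    """Normalized payments: losers pay 0, winners pay their critical value."""
    result = []
    for i in range(len(bids)):
        c = critical_value(wins, i, bids, lo, hi) if wins(i, bids) else None
        # Assumption: when the critical value is undefined we charge the constant 0.
        result.append(0.0 if c is None else c)
    return result

def top_k_wins(k):
    """Hypothetical monotone rule: the k highest bids win, ties to lower index."""
    def wins(i, bids):
        ranked = sorted(range(len(bids)), key=lambda j: (-bids[j], j))
        return i in ranked[:k]
    return wins
```

For example, with bids `[5.0, 3.0, 8.0, 1.0]` and `k = 2`, both winners are charged approximately 3, the highest losing bid, which is the familiar uniform-price generalization of the second price auction.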


Notice that this characterization leaves ample space for non-affine-maximization. 
For example we can implement social functions such as maximizing the Euclidean norm
$\arg\max_a \sum_i v_i(a)^2$ or maximizing the minimum value $\arg\max_a \min_i v_i(a)$. Indeed in
many cases this flexibility allows the design of computationally efficient approximation 
mechanisms for problems whose exact optimization is computationally intractable — 
an example is given in Chapter 12. 


9.5.5 Uniqueness of Prices 


This section has so far focused on characterizing the implementable social choice 
functions. What about the payment functions? It turns out that the payment function 
is essentially uniquely determined by the social choice function. “Essentially” means 
that if we take an incentive compatible mechanism with payments $p_i$ and modify the
payments to $p'_i(v_1, \dots, v_n) = p_i(v_1, \dots, v_n) + h_i(v_{-i})$ for an arbitrary function $h_i$ that
does not depend on $v_i$, then incentive compatibility remains. It turns out that this is the
only leeway in the payment. 


Theorem 9.37 Assume that the domains of preference $V_i$ are connected sets
in the usual metric in the Euclidean space. Let $(f, p_1, \dots, p_n)$ be an incentive
compatible mechanism. The mechanism with modified payments $(f, p'_1, \dots, p'_n)$
is incentive compatible if and only if for some functions $h_i : V_{-i} \to \mathbb{R}$ we have
that $p'_i(v_1, \dots, v_n) = p_i(v_1, \dots, v_n) + h_i(v_{-i})$ for all $v_1, \dots, v_n$.


PROOF The “if” part is clear since $h_i$ has no strategic implications for player $i$,
so we need only prove the only-if part. Assume that $(f, p'_1, \dots, p'_n)$ is incentive
compatible, and for the rest of the proof fix some $i$ and some $v_{-i}$.

For every $a \in A$ denote $V^a = \{v_i \in V_i \mid f(v_i, v_{-i}) = a\}$. Using Proposition
9.27, the payment $p(v_i, v_{-i})$ is identical for all $v_i \in V^a$ and will be denoted
by $p_a$. Similarly we denote $p'_a = p'(v_i, v_{-i})$ for some $v_i \in V^a$. It now suffices to
show that for every $a, b \in A$, $p_a - p_b = p'_a - p'_b$.

For $a, b \in A$ we will say that $a$ and $b$ are close if for every $\epsilon > 0$ there exist
$v_i^a, v_i^b \in V_i$ such that $\|v_i^a - v_i^b\| = \max_{c \in A} |v_i^a(c) - v_i^b(c)| \le \epsilon$, and
$f(v_i^a, v_{-i}) = a$ and $f(v_i^b, v_{-i}) = b$. We will first prove the required
$p_a - p_b = p'_a - p'_b$ for close $a, b$. Fix $v_i^a, v_i^b \in V_i$ as in the definition of closeness.
Since a bidder with type $v_i^a$ does not gain by declaring $v_i^b$ with payments $p$, we have that
$v_i^a(a) - p_a \ge v_i^a(b) - p_b$, and since a bidder with type $v_i^b$ does not gain by declaring
$v_i^a$ we have that $v_i^b(a) - p_a \le v_i^b(b) - p_b$. Putting together and rearranging we have that
$v_i^a(b) - v_i^a(a) \le p_b - p_a \le v_i^b(b) - v_i^b(a)$. Similarly, by considering the mechanism with
payments $p'$ we have $v_i^a(b) - v_i^a(a) \le p'_b - p'_a \le v_i^b(b) - v_i^b(a)$. But now recall
that $\|v_i^a - v_i^b\| \le \epsilon$ and thus the upper bound and the lower bound for $p_b - p_a$
and for $p'_b - p'_a$ are at most $2\epsilon$ apart and thus $|(p_b - p_a) - (p'_b - p'_a)| \le 2\epsilon$.
Since $\epsilon$ was arbitrary, $p_b - p_a = p'_b - p'_a$.

To show $p_b - p_a = p'_b - p'_a$ for general (not necessarily close) $a$ and
$b$, consider $B = \{b \in A \mid p_b - p_a = p'_b - p'_a\}$. Since $p_b - p_a = p'_b - p'_a$ and
$p_c - p_b = p'_c - p'_b$ implies $p_c - p_a = p'_c - p'_a$, we have that no alternative in
$A - B$ can be close to any alternative in $B$. Thus $V^B = \bigcup_{b \in B} V^b$ has positive
distance from its complement $V^{A-B} = \bigcup_{b \notin B} V^b$, contradicting the connectedness
of $V$.


It is not difficult to see that the assumption that V; is connected is essential, as for 
example, if the valuations are restricted to be integral, then modifying $p_i$ by any small
constants $\epsilon < 1/2$ will not modify incentive compatibility.

From this, and using the revelation principle, we can directly get many corollaries: 


(i) The only incentive compatible mechanisms that maximize social welfare are those
with VCG payments.

(ii) In the bilateral trade problem (Section 9.3.5.3) the only incentive compatible
mechanism that maximizes social welfare and makes no payments in case of no-trade
is the one shown there which subsidizes the trade. More generally, if a mechanism
for bilateral trade satisfies ex-post individual rationality, then it cannot dictate
positive payments from the players in case of no-trade and thus it must subsidize
trade.

(iii) In the public project problem (Section 9.3.5.5) no ex-post individually rational
mechanism that maximizes social welfare can recover the cost of the project. Again, the
uniqueness of payments implies that if players with value 0 pay 0 (which is as
much as they can pay maintaining individual rationality) then their payments in case
of building the project must be identical to those obtained using the Clarke pivot
rule.


In Section 9.6.3 we will see a similar theorem in the Bayesian setting, a theorem
that will extend all of these corollaries to that setting as well.


9.5.6 Randomized Mechanisms 


All of our discussion so far considered only deterministic mechanisms. It is quite 
natural to allow also randomized mechanisms. Such mechanisms would be allowed to 
produce a distribution over alternatives and a distribution over payments. Alternatively, 
but specifying slightly more structure, we can allow distributions over deterministic 
mechanisms. This will allow us to distinguish between two notions of incentive com- 
patibility. 


Definition 9.38

• A randomized mechanism is a distribution over deterministic mechanisms (all with
the same players, type spaces $V_i$, and outcome space $A$).

• A randomized mechanism is incentive compatible in the universal sense if every
deterministic mechanism in the support is incentive compatible.

• A randomized mechanism is incentive compatible in expectation if truth is a
dominant strategy in the game induced by expectation. That is, if for all $i$, all
$v_i$, $v_{-i}$, and $v'_i$, we have that $E[v_i(a) - p_i] \ge E[v_i(a') - p'_i]$, where $(a, p_i)$ and
$(a', p'_i)$ are random variables denoting the outcome and payment when $i$ bids,
respectively, $v_i$ and $v'_i$, and $E[\cdot]$ denotes expectation over the randomization of the
mechanism.


It is clear that incentive compatibility in the universal sense implies incentive com- 
patibility in expectation. For most purposes incentive compatibility in expectation 
seems to be the more natural requirement. The universal definition is important if play- 
ers are not risk neutral (which we do not consider in this chapter) or if the mechanism’s 
internal randomization is not completely hidden from the players. As we will see in 
Chapters 12 and 13 randomized mechanisms can often be useful and achieve more than 
deterministic ones. 

We will now characterize randomized incentive compatible mechanisms over single
parameter domains. Recall the single parameter setting and notations from Section
9.5.4. We will denote the probability that $i$ wins by $w_i(v_i, v_{-i}) = \Pr[f(v_i, v_{-i}) \in W_i]$
(probability taken over the randomization of the mechanism) and will use $p_i(v_i, v_{-i})$
to directly denote the expected payment of $i$. In this notation the utility of player $i$ with
valuation $v_i$ when declaring $v'_i$ is $v_i \cdot w_i(v'_i, v_{-i}) - p_i(v'_i, v_{-i})$. For ease of notation we
will focus on normalized mechanisms in which the lowest bid $v_i^0 = t^0$ loses completely,
$w_i(v_i^0, v_{-i}) = 0$, and pays nothing, $p_i(v_i^0, v_{-i}) = 0$.


Theorem 9.39 A normalized randomized mechanism in a single parameter domain
is incentive compatible in expectation if and only if for every $i$ and every
fixed $v_{-i}$ we have that

(i) the function $w_i(v_i, v_{-i})$ is monotonically nondecreasing in $v_i$ and

(ii) $p_i(v_i, v_{-i}) = v_i \cdot w_i(v_i, v_{-i}) - \int_{v_i^0}^{v_i} w_i(t, v_{-i})\,dt$.


PROOF In the proof we will simplify notation by removing the index $i$ and the
fixed argument $v_{-i}$ everywhere. In this notation, to show incentive compatibility
we need to establish that $v w(v) - p(v) \ge v w(v') - p(v')$ for every $v'$. Plugging in
the formula for $p$ we get $\int_{v^0}^{v} w(t)\,dt \ge \int_{v^0}^{v'} w(t)\,dt - (v' - v) w(v')$. For $v' > v$ this
is equivalent to $(v' - v) w(v') \ge \int_{v}^{v'} w(t)\,dt$, which is true due to the monotonicity
of $w$. For $v' < v$ we get $(v - v') w(v') \le \int_{v'}^{v} w(t)\,dt$, which again is true due to the
monotonicity of $w$.

In the other direction, combining the incentive constraint at $v$,
$v w(v) - p(v) \ge v w(v') - p(v')$, with the incentive constraint at $v'$,
$v' w(v) - p(v) \le v' w(v') - p(v')$, and subtracting the inequalities, we get
$(v' - v) w(v) \le (v' - v) w(v')$, which implies monotonicity of $w$.

To derive the formula for $p$, we can rearrange the two incentive constraints as

$v \cdot (w(v') - w(v)) \le p(v') - p(v) \le v' \cdot (w(v') - w(v))$.

Now by letting $v' = v + \epsilon$, dividing throughout by $\epsilon$, and taking the limit, both
sides approach the same value, $v \cdot dw/dv$, and we get $dp/dv = v \cdot dw/dv$.
Thus, taking into account the normalization condition $p(v^0) = 0$, we have that
$p(v_i) = \int_{v^0}^{v_i} v \cdot w'(v)\,dv$, and integrating by parts completes the proof. (This seems
to require the differentiability of $w$, but as $w$ is monotone this holds almost
everywhere, which suffices since we immediately integrate.)
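
The payment formula of Theorem 9.39 is easy to evaluate numerically. The sketch below does so with a midpoint rule and tests it on a hypothetical mechanism of our own choosing: a price $r$ is drawn uniformly from $[0, 1]$ and the bidder wins and pays $r$ whenever his bid is at least $r$, so that $w(v) = v$ on $[0, 1]$ and the expected payment is $v^2 / 2$. The function names, the step count, and the grid are illustrative assumptions.

```python
def expected_payment(w, v, v0=0.0, steps=1000):
    """p(v) = v * w(v) - integral of w from v0 to v (midpoint rule)."""
    h = (v - v0) / steps
    integral = h * sum(w(v0 + (k + 0.5) * h) for k in range(steps))
    return v * w(v) - integral

def w(t):
    """Winning probability under a uniform random posted price on [0, 1]."""
    return min(max(t, 0.0), 1.0)

# Check incentive compatibility in expectation on a grid of values and reports.
grid = [k / 20 for k in range(21)]
pay = {r: expected_payment(w, r) for r in grid}

def utility(value, report):
    return value * w(report) - pay[report]

ic_holds = all(utility(v, v) >= utility(v, r) - 1e-9 for v in grid for r in grid)
```

Here `expected_payment(w, 0.6)` evaluates to about 0.18, matching $0.6^2 / 2$, and truthful reports are optimal at every grid point (up to the small numerical tolerance).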


We should point out explicitly that the randomization in a randomized mechanism is 
completely controlled by the mechanism designer and has nothing to do with any dis- 
tributional assumptions on players’ valuations as will be discussed in the next section. 


9.6 Bayesian-Nash Implementation


So far in this chapter we have considered only implementation in dominant strategies 
(and the very similar ex-post-Nash). As mentioned in Section 9.4 this is usually consid- 
ered too strict a definition in economic theory. It models situations where each player 
has no information at all about the private information of the others — not even a prior 
distribution — and must operate under a “worst case” assumption. The usual working 
definition in economic theory takes a Bayesian approach, assumes some commonly 
known prior distribution, and assumes that a player that lacks some information will 
optimize in a Bayesian sense according to the information that he does have. The 
formalization of these notions, mostly by Harsanyi, was a major development in eco- 
nomic theory in the 1960s and 1970s, and is certainly still the dominant approach to 
handling lack of information in economic theory. In this section we will give these 
basic notions in the context of mechanism design, again limiting ourselves to settings 
with independent private values. 


9.6.1 Bayesian-Nash Equilibrium


Definition 9.40 A game with (independent private values and) incomplete
information on a set of $n$ players is given by the following ingredients:

(i) For every player $i$, a set of actions $X_i$.

(ii) For every player $i$, a set of types $T_i$, and a prior distribution $D_i$ on $T_i$. A value
$t_i \in T_i$ is the private information that $i$ has, and $D_i(t_i)$ is the a priori probability
that $i$ gets type $t_i$.

(iii) For every player $i$, a utility function $u_i : T_i \times X_1 \times \cdots \times X_n \to \mathbb{R}$, where
$u_i(t_i, x_1, \dots, x_n)$ is the utility achieved by player $i$, if his type (private
information) is $t_i$, and the profile of actions taken by all players is $x_1, \dots, x_n$.


The main idea that we wish to capture with this definition is that each player $i$ must
choose his action $x_i$ knowing $t_i$ but not the other $t_j$’s, knowing only
the prior distribution $D_j$ on each other $t_j$. The behavior of player $i$ in such a setting is
captured by a function that specifies which action $x_i$ is taken for every possible type $t_i$;
this is termed a strategy. It is these strategies that we would want to be in equilibrium.


Definition 9.41 A strategy of a player $i$ is a function $s_i : T_i \to X_i$. A profile of
strategies $s_1, \dots, s_n$ is a Bayesian-Nash equilibrium if for every player $i$ and every
$t_i$ we have that $s_i(t_i)$ is the best response that $i$ has to $s_{-i}(\cdot)$ when his type is $t_i$, in
expectation over the types of the other players. Formally: for all $i$, all $t_i$, and all $x'_i$:
$E_{D_{-i}}[u_i(t_i, s_i(t_i), s_{-i}(t_{-i}))] \ge E_{D_{-i}}[u_i(t_i, x'_i, s_{-i}(t_{-i}))]$ (where $E_{D_{-i}}[\cdot]$ denotes the
expectation over the other types $t_{-i}$ being chosen according to distribution $D_{-i}$).


This now allows us to define implementation in the Bayesian sense. 


Definition 9.42 A Bayesian mechanism for $n$ players is given by (a) players’
type spaces $T_1, \dots, T_n$ and prior distributions on them $D_1, \dots, D_n$, (b) players’
action spaces $X_1, \dots, X_n$, (c) an alternative set $A$, (d) players’ valuation
functions $v_i : T_i \times A \to \mathbb{R}$, (e) an outcome function $a : X_1 \times \cdots \times X_n \to A$, and (f)
payment functions $p_1, \dots, p_n$, where $p_i : X_1 \times \cdots \times X_n \to \mathbb{R}$.

The game with incomplete information induced by the mechanism is given by
using the type spaces $T_i$ with prior distributions $D_i$, the action spaces $X_i$, and the
utilities $u_i(t_i, x_1, \dots, x_n) = v_i(t_i, a(x_1, \dots, x_n)) - p_i(x_1, \dots, x_n)$.

The mechanism implements a social choice function $f : T_1 \times \cdots \times T_n \to A$
in the Bayesian sense if for some Bayesian-Nash equilibrium $s_1, \dots, s_n$ of
the induced game ($s_i : T_i \to X_i$) we have that for all $t_1, \dots, t_n$,
$f(t_1, \dots, t_n) = a(s_1(t_1), \dots, s_n(t_n))$.


In particular it should be clear that every ex-post-Nash implementation is by defi- 
nition also a Bayesian implementation for any distributions D;. In general, however, 
being a Bayesian implementation depends on the distributions D; and there are many 
cases where a Bayesian—Nash equilibrium exists even though no dominant-strategy 
one does. A simple example — a first price auction — is shown in the next subsection. 
Just like in the case of dominant-strategy implementations, Bayesian implementations 
can also be turned into ones that are truthful in a Bayesian sense. 


Definition 9.43 A mechanism is truthful in the Bayesian sense if (a) it is “direct
revelation”; i.e., the type spaces are equal to the action spaces, $T_i = X_i$, and (b)
the truthful strategies $s_i(t_i) = t_i$ are a Bayesian-Nash equilibrium.


Proposition 9.44 (Revelation principle) If there exists an arbitrary mechanism
that implements $f$ in the Bayesian sense, then there exists a truthful
mechanism that implements $f$ in the Bayesian sense. Moreover, the expected
payments of the players in the truthful mechanism are identical to those, obtained
in equilibrium, in the original mechanism.


The proof is similar to the proof of the same principle in the dominant-strategy 
setting given in Proposition 9.25. 


9.6.2 First Price Auction 


As an example of Bayesian analysis we study the standard first price auction in a 
simple setting: a single item is auctioned between two players, Alice and Bob. Each 
has a private value for the item: $a$ is Alice’s value and $b$ is Bob’s value. While we
already saw that a second price auction will allocate the item to the one with higher
value, here we ask what would happen if the auction rules are the usual first-price ones:
the highest bidder pays his bid. Certainly Alice will not bid $a$, since if she does, her
utility will be 0 even if she wins. She will thus need to bid some $x < a$, but how much
lower? If she knew that Bob would bid $y$, she would certainly bid $x = y + \epsilon$ (as long
as $x < a$). But she does not know $y$, or even $b$, on which $y$ would depend; she only
knows the distribution $D_{\text{Bob}}$ over $b$.

Let us now see how this situation falls in the Bayesian-Nash setting described
above: The type space $T_{\text{Alice}}$ of Alice and $T_{\text{Bob}}$ of Bob is the nonnegative real numbers,
with $t_{\text{Alice}}$ denoted by $a$ and $t_{\text{Bob}}$ denoted by $b$. The distributions over the type space
are $D_{\text{Alice}}$ and $D_{\text{Bob}}$. The action spaces $X_{\text{Alice}}$ and $X_{\text{Bob}}$ are also the nonnegative real
numbers, with $x_{\text{Alice}}$ denoted by $x$ and $x_{\text{Bob}}$ denoted by $y$. The possible outcomes are
{Alice-wins, Bob-wins}, with $v_{\text{Alice}}(\text{Bob-wins}) = 0$ and $v_{\text{Alice}}(\text{Alice-wins}) = a$ (and
similarly for Bob). The outcome function is that Alice-wins if $x \ge y$ and Bob-wins
otherwise (we arbitrarily assume here that ties are broken in favor of Alice). Finally,
the payment functions are $p_{\text{Alice}} = 0$ whenever Bob-wins and $p_{\text{Alice}} = x$ whenever
Alice-wins, while $p_{\text{Bob}} = y$ whenever Bob-wins and $p_{\text{Bob}} = 0$ whenever Alice-wins.
Our question translates into finding the Bayesian-Nash equilibrium of this game.
Specifically we wish to find a strategy $s_{\text{Alice}}$ for Alice, given by a function $x(a)$, and a
strategy $s_{\text{Bob}}$ for Bob, given by the function $y(b)$, that are in Bayesian equilibrium, i.e.,
are best-replies to each other.

In general, finding Bayesian-Nash equilibria is not easy. Even for this very
simple first price auction the answer is not clear for general distributions $D_{\text{Alice}}$ and
$D_{\text{Bob}}$. However, for the symmetric case where $D_{\text{Alice}} = D_{\text{Bob}}$, the situation is simpler
and a closed form expression for the equilibrium strategies may be found. We will
prove it for the special case of uniform distributions on the interval [0, 1]. Similar
arguments work for arbitrary nonatomic distributions over the valuations as well as for
any number of bidders.


Lemma 9.45 In a first price auction among two players with prior distributions
of the private values $a$, $b$ uniform over the interval [0, 1], the strategies $x(a) = a/2$
and $y(b) = b/2$ are in Bayesian-Nash equilibrium.


Note that in particular $x < y$ if and only if $a < b$; thus the winner is also the player
with the highest private value. This means that the first price auction also maximizes social
welfare, just like a second-price auction.


PROOF Let us consider which bid x is Alice’s optimal response to Bob’s strategy
y = b/2, when Alice has value a. The utility for Alice is 0 if she loses and
a − x if she wins and pays x, thus her expected utility from bid x is given by
$u_{Alice}$ = Pr[Alice wins with bid x] · (a − x), where the probability is over the
prior distribution over b. Now Alice wins if x > y, and given Bob’s strategy
y = b/2, this is exactly when x > b/2. Since b is distributed uniformly in [0, 1]
we can readily calculate this probability: 2x for 0 ≤ x ≤ 1/2, 1 for x ≥ 1/2, and
0 for x ≤ 0. It is easy to verify that the optimal value of x is indeed in the range
0 ≤ x ≤ 1/2 (since x = 1/2 is clearly better than any x > 1/2, and since any
x < 0 will give utility 0). Thus, to optimize the value of x, we need to find the
maximum of the function 2x(a − x) over the range 0 ≤ x ≤ 1/2. The maximum
may be found by taking the derivative with respect to x and equating it to 0, which
gives 2a − 4x = 0, whose solution is x = a/2 as required.
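
The best-reply computation in this proof can be checked numerically. The following is a minimal sketch, assuming Bob bids b/2 with b uniform on [0, 1]; the function names and the grid resolution are illustrative choices, not part of the text.

```python
def alice_expected_utility(a: float, x: float) -> float:
    """Expected utility of bidding x with value a when Bob bids b/2, b ~ U[0, 1]."""
    # Alice wins when b/2 < x, which happens with probability 2x clipped to [0, 1].
    win_probability = min(max(2.0 * x, 0.0), 1.0)
    return win_probability * (a - x)


def best_response(a: float, steps: int = 10_000) -> float:
    """Grid search over bids in [0, 1] for the bid maximizing Alice's expected utility."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda x: alice_expected_utility(a, x))


# Compare the numerically best bid with the closed form a/2 from the lemma.
for a in (0.2, 0.5, 0.9):
    print(a, best_response(a), a / 2)
```

Up to the grid resolution, the numerically best bid should coincide with a/2 for every value a in [0, 1].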


9.6.3 Revenue Equivalence 


Let us now attempt comparing the first price auction and the second price auction. The 
social choice function implemented is exactly the same: giving the item to the player 
with highest private value. How about the payments? Where does the auctioneer get 
higher revenue? One can readily express the revenue of the second-price auction as 
min (a, b) and the revenue of the first-price auction as max(a/2, b/2), and it is clear 
that each of these expressions is higher for certain values of a and b. 

But which is better on the average — in expectation over the prior distributions of 
a and b? Simple calculations will reveal that the expected value of min(a, b) when 
a and b are chosen uniformly in [0, 1] is exactly 1/3. Similarly the expected value of 
max(a/2, b/2) when a and b are chosen uniformly in [0, 1] is also exactly 1/3. Thus 
both auctions generate equivalent revenue in expectation! This is no coincidence. It 
turns out that in quite general circumstances every two Bayesian-Nash implementations
of the same social choice function generate the same expected revenue. 
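
The two "simple calculations" can be written out as follows, using the identity $E[Z] = \int_0^1 \Pr[Z > t]\,dt$ for a random variable Z taking values in [0, 1]. This is one possible derivation, given here as a sketch.

```latex
\begin{align*}
E[\min(a,b)] &= \int_0^1 \Pr[a > t]\,\Pr[b > t]\,dt
              = \int_0^1 (1-t)^2\,dt = \frac{1}{3},\\[4pt]
E\!\left[\max\!\left(\tfrac{a}{2},\tfrac{b}{2}\right)\right]
             &= \frac{1}{2}\,E[\max(a,b)]
              = \frac{1}{2}\int_0^1 \bigl(1 - \Pr[a \le t]\,\Pr[b \le t]\bigr)\,dt
              = \frac{1}{2}\int_0^1 (1-t^2)\,dt
              = \frac{1}{2}\cdot\frac{2}{3} = \frac{1}{3}.
\end{align*}
```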


Theorem 9.46 (The Revenue Equivalence Principle) Under certain weak assumptions
(to be detailed in the proof body), for every two Bayesian-Nash implementations
of the same social choice function f, we have that if for some type $t_i^0$
of player i, the expected (over the types of the other players) payment of player i
is the same in the two mechanisms, then it is the same for every value of $t_i$. In
particular, if for each player i there exists a type $t_i^0$ where the two mechanisms have
the same expected payment for player i, then the two mechanisms have the same
expected payments from each player and their expected revenues are the same.


Thus, for example, all single-item auctions that allocate (in equilibrium) the item to 
the player with highest value and in which losers pay 0, will have identical expected 
revenue. 

The similarity to Theorem 9.37 should be noted: in both cases it is shown that the 
allocation rule determines the payments, up to a normalization. In the case of dominant 
strategy implementation, this is true for every fixed type of the other players, while in 
the case of Bayesian-Nash implementation, this is true in expectation over the types
of the others. The proofs of the two theorems look quite different due to technical 
reasons. The underlying idea is the same: take two “close” types, then the equations 
specifying that for neither type does a player gain by misrepresenting himself as the 
other type, put together, determine the difference in payments in terms of the social 
choice function. 


PROOF Using the revelation principle, we can first limit ourselves to mechanisms
that are truthful in the Bayesian-Nash sense. Let us denote by $V_i$ the space
of valuation functions $v_i(t_i, \cdot)$ over all $t_i$.

Assumption 1 Each $V_i$ is convex. (Note that this holds for essentially every
example we had so far. This condition can be replaced by path-connectedness,
and the proof becomes just slightly messier.)


Take any type $t_i^1 \in T_i$. We will derive a formula for the expected payment
for this type that depends only on the expected payment for type $t_i^0$ and on
the social choice function f. Thus any two mechanisms that implement the
same social choice function and have identical expected payments at $t_i^0$ will
also have identical expected payments at $t_i^1$. For this, let us now introduce some
notation:


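• $v^0$ is the valuation $v_i(t_i^0, \cdot)$. $v^1$ is the valuation $v_i(t_i^1, \cdot)$. We will look at these as vectors
(in $V_i \subseteq \mathbb{R}^A$), and look at their convex combinations $v^\lambda = v^0 + \lambda(v^1 - v^0)$. The
convexity of $V_i$ implies that $v^\lambda \in V_i$ and thus there exists some type $t_i^\lambda$ such that
$v^\lambda = v_i(t_i^\lambda, \cdot)$.

• $p^\lambda$ is the expected payment of player i at type $t_i^\lambda$: $p^\lambda = E_{t_{-i}}\, p_i(t_i^\lambda, t_{-i})$.

• $w^\lambda$ is the probability distribution of $f(t_i^\lambda, \cdot)$, i.e., for every $a \in A$, $w^\lambda(a) =
\Pr_{t_{-i}}[f(t_i^\lambda, t_{-i}) = a]$.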


Assumption 2 $w^\lambda$ is continuously differentiable in $\lambda$. (This assumption is not
really needed, but allows us to simply take derivatives and integrals as convenient.)

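Once we have this notation in place, the proof is easy. Note that under these
notations the expected utility of player i with type $t_i^\lambda$ that declares $t_i^\alpha$ is given
by the expression $v^\lambda \cdot w^\alpha - p^\alpha$. Since a player with type $t_i^\lambda$ prefers reporting the
truth rather than $t_i^{\lambda+\epsilon}$ we have that $v^\lambda \cdot w^\lambda - p^\lambda \ge v^\lambda \cdot w^{\lambda+\epsilon} - p^{\lambda+\epsilon}$. Similarly,
a player with type $t_i^{\lambda+\epsilon}$ prefers reporting the truth rather than $t_i^\lambda$, so we have
$v^{\lambda+\epsilon} \cdot w^\lambda - p^\lambda \le v^{\lambda+\epsilon} \cdot w^{\lambda+\epsilon} - p^{\lambda+\epsilon}$. Rearranging and putting together, we get


$$v^\lambda \cdot (w^{\lambda+\epsilon} - w^\lambda) \;\le\; p^{\lambda+\epsilon} - p^\lambda \;\le\; v^{\lambda+\epsilon} \cdot (w^{\lambda+\epsilon} - w^\lambda).$$


Now divide throughout by $\epsilon$ and let $\epsilon$ approach 0. $v^{\lambda+\epsilon}$ approaches $v^\lambda$,
$(w^{\lambda+\epsilon} - w^\lambda)/\epsilon$ approaches the vector $dw^\lambda/d\lambda = w'(\lambda)$, and thus we get that
$(p^{\lambda+\epsilon} - p^\lambda)/\epsilon$ approaches $v^\lambda \cdot w'(\lambda)$, and thus the derivative of $p^\lambda$ is defined and is
continuous. Integrating, we get $p^1 = p^0 + \int_0^1 v^\lambda \cdot w'(\lambda)\, d\lambda$.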
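
As a sanity check, the payment formula can be instantiated for the two-bidder auction with uniform values discussed above. The following worked example is a sketch under the assumptions stated in it: the alternatives are "player i wins" and "player i loses", the reference type is $t_i^0 = 0$ (which pays 0 in expectation in both auctions), and the target type is $t_i^1 = a$.

```latex
\begin{align*}
v^\lambda &= (\lambda a,\; 0)
  && \text{(value } \lambda a \text{ for winning, } 0 \text{ for losing)}\\
w^\lambda &= (\lambda a,\; 1-\lambda a)
  && \text{(the type } \lambda a \text{ wins iff the other value is below } \lambda a)\\
w'(\lambda) &= (a,\; -a)\\
p^1 &= p^0 + \int_0^1 v^\lambda \cdot w'(\lambda)\,d\lambda
     = 0 + \int_0^1 \lambda a^2\,d\lambda = \frac{a^2}{2}.
\end{align*}
% First price: bid a/2, win with probability a, so the expected payment is a^2/2.
% Second price: pay b whenever b < a, so the expected payment is \int_0^a b\,db = a^2/2.
% Summing over both players: 2\int_0^1 (a^2/2)\,da = 1/3, the revenue computed earlier.
```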


Thus the revenue equivalence theorem tells us that we cannot increase revenue 
without changing appropriately the allocation rule (social choice function) itself. In 
particular, all the corollaries in Section 9.5.5 apply, in the sense of expectation, to 
all Bayesian-Nash implementations. However, if we are willing to modify the social
choice function, then we can certainly increase revenue. Here is an example for the 
case of an auction with two bidders with valuations distributed uniformly in [0, 1]: 
Put a reservation price of 1/2, and then sell to the highest bidder for a price that is the 
maximum of the low bid and the reservation price, 1/2. If both bidders bid below the 
reservation price, then none of them wins. First, it is easy to verify that this rule is 
incentive compatible. Then a quick calculation will reveal that the expected revenue of 
this auction is 5/12 which is more than the 1/3 obtained by the regular second price 
or first price auctions. Chapter 13 discusses revenue maximization further. 
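
The "quick calculation" can be reproduced numerically. The sketch below assumes two bidders with independent values uniform on [0, 1] and approximates the expectation with a midpoint grid; the helper names and grid size are illustrative choices.

```python
def expected_revenue(payment, n: int = 400) -> float:
    """Midpoint-rule estimate of E[payment(a, b)] for a, b independent uniform on [0, 1]."""
    total = 0.0
    for i in range(n):
        a = (i + 0.5) / n
        for j in range(n):
            b = (j + 0.5) / n
            total += payment(a, b)
    return total / (n * n)


def second_price(a: float, b: float) -> float:
    # Truthful bids: the winner pays the lower value.
    return min(a, b)


def first_price(a: float, b: float) -> float:
    # Equilibrium bids a/2 and b/2 (Lemma 9.45): the winner pays his own bid.
    return max(a / 2, b / 2)


def second_price_with_reserve(a: float, b: float, reserve: float = 0.5) -> float:
    # No sale if both bids are below the reserve; otherwise the winner pays
    # the larger of the low bid and the reserve price.
    if max(a, b) < reserve:
        return 0.0
    return max(min(a, b), reserve)


for rule in (second_price, first_price, second_price_with_reserve):
    print(rule.__name__, round(expected_revenue(rule), 4))
```

The first two estimates should be close to 1/3 and the third close to 5/12, in line with the values stated in the text.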


9.7 Further Models


This chapter has concentrated on basic models. Here we briefly mention several model
extensions that address issues ignored by the basic models and have received attention 
in economic theory. 


9.7.1 Risk Aversion 


All of our discussion in the Bayesian model assumed that players are risk-neutral: 
obtaining a utility of 2 with probability 1/2 is equivalent to obtaining a utility of 1 with 
probability 1. This is why we could just compute players’ utilities by taking expectation. 
In reality, players are often risk-averse, preferring a somewhat lower utility if it is
more certain. A significant body of work in economic theory deals with formalizing and
analyzing strategic behavior of such players. In our context, a particularly interesting 
observation is that the revenue equivalence principle fails and that with risk-averse 
bidders different mechanisms that implement the same social choice function may 
have different revenue. As an example it is known that first price auctions generate 
more revenue than second price auctions if the bidders are risk-averse. 


9.7.2 Interdependent Values 


We have considered only independent private value models: the types of the players 
are chosen independently of each other and each player’s valuation depends only on
his own private information. In a completely general setting, there would be some joint
distribution over “states of the world” where such a state determines the valuations of all
players. Players would not necessarily get as private information their own valuation,
but rather each would get some “signal” (partial information about the state of the
world) that provides some information about his own valuation and some about the
valuations of others. Most of the results in this chapter cease holding for general models 
with interdependent values. 

A case that is in the extreme opposite to the private value model is the “common 
value” model. In an auction of a single item under this model, we assume that the 
object in question has exactly the same value for all bidders. The problem is that 
none of them know exactly what this value is and each player’s signal only provides 
some partial information. An example is an auction for financial instruments such as 
bonds. Their exact value is not completely known as it depends on future interest 
rates, the probability of default, etc. What is clear though is that whatever value the 
bonds will turn out to have, it will be the same for everyone. In such settings, an 
auction really serves as an information aggregation vehicle, reaching a joint estimate 
of the value by combining all players’ signals. A common pitfall in such cases is 
the “winner’s curse”: if each bidder bids their own estimate of the object’s common
value, as determined from their own signal, then the winner will likely regret winning.
The fact that a certain bidder won means that other signals implied a lower value,
which likely means that the real value is lower than the estimate of the winner. Thus 
in equilibrium bidders must bid an estimate that is also conditioned on the fact that 
they win. 
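
The winner's curse is easy to observe in simulation. The sketch below uses a hypothetical model that is not taken from the text: the common value is uniform on [0, 1], each bidder's signal is the value plus independent Gaussian noise, and every bidder naively treats his own signal as his estimate. It measures how much the highest estimate exceeds the true value on average.

```python
import random


def average_overestimate(bidders: int = 5, noise: float = 0.1,
                         trials: int = 20_000, seed: int = 0) -> float:
    """Average of (highest signal - true common value) over simulated auctions.

    Assumed model (illustrative only): value ~ U[0, 1], signal = value + N(0, noise).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        value = rng.random()
        signals = [value + rng.gauss(0.0, noise) for _ in range(bidders)]
        # The naive winner is the bidder with the highest estimate.
        total += max(signals) - value
    return total / trials


print(average_overestimate(bidders=1))  # a single signal is unbiased on average
print(average_overestimate(bidders=5))  # the winning signal is biased upward
```

Each individual signal is an unbiased estimate, yet the winning estimate is systematically too high, which is why equilibrium bids must condition on the event of winning.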

A commonly considered formalization that takes into account both a private value
component and a common value component is that of affiliated signals. Roughly speaking,
in such models each player gets a signal that is positively correlated (in a strong
technical sense called affiliation) not only with his own value but also with the values 
of other players. In such settings, ascending English auctions are “better” (generate 
more revenue) than the non-adaptive second price auction (which is equivalent to an 
English auction in private value models): as the bidding progresses, each bidder gets 
information from the other bidders that increases his estimate of his value. 


9.7.3 Complete Information Models 


Our main point of view was that each player has its own private information. Some 
models consider a situation where all players have complete information about the 
game; it is only the mechanism designer who is lacking such information. A prototypical
instance is that of King Solomon: two women, each claiming that the baby is hers. The 
women both know who the real mother is, but not King Solomon — he must design 
a mechanism that elicits this information from their different preferences. Several 
notions of implementation in such settings exist, and in general, mechanism design
is much easier in this setting. In particular, many implementations without money are 
possible. 


9.7.4 Hidden Actions 


All of the theory of Mechanism Design attempts to overcome the problem that players
have private information that is not known to the mechanism designer. In many settings 
a different stumbling block occurs: players may perform hidden actions that are not 
visible to the “mechanism.” This complementary difficulty to the private information 
difficulty has been widely studied in economics and has recently started to be considered 
in computer science settings. 


9.8 Notes 


Most of the material in this chapter can be found in graduate textbooks on microeconomics
such as Mas-Colell et al. (1995). The books (Krishna, 2002; Klemperer,
2004) on auction theory contain more detail. As the Internet gained influence, during
the 1990s, researchers in AI, computer networks, and economics started noticing that 
mechanism design can be applied in computational settings. This was put forward 
in a general way in Nisan and Ronen (2001) who also coined the term Algorithmic 
Mechanism Design. 

The earliest work on voting methods including that of Condorcet and Borda goes 
back to the late 18th century, appropriately around the time of the French Revolution. 
The modern treatment of social choice theory originates with the seminal work of Arrow 
(1951), where Arrow’s theorem also appears. Over the years many proofs for Arrow’s 
theorem have been put forward; we presented one of those, following Geanakoplos (2005). The
Gibbard-Satterthwaite theorem is due to Gibbard (1973) and Satterthwaite (1975). The 
computational difficulty of manipulation of voting rules was first studied in Bartholdi
et al. (1989). 

The positive results in Mechanism Design in the quasi-linear setting originate with 
the seminal work of Vickrey (1961), who, in particular, studied single-item auctions 
and multiunit auctions with downward sloping valuations. The public project problem 
was studied by Clarke (1971), who also suggested the pivot rule, and the general 
formulation of what is now called VCG mechanisms appears in Groves (1973). The 
Bilateral Trade problem was studied in Myerson and Satterthwaite (1983), and the 
application of buying a path in a network was put forward in Nisan and Ronen (2001). 

The general framework of Mechanism Design and its basic notions have evolved 
in microeconomic theory mostly in the 1970s, and mostly in the general Bayesian 
setting that we only get to in Section 9.6. Among the influential papers in laying out the 
foundations are Vickrey (1961), Clarke (1971), Groves (1973), Satterthwaite (1975), 
Green and Laffont (1977), Dasgupta et al. (1979), and Myerson (1981). 

Early papers in algorithmic Mechanism Design, such as Nisan and Ronen (2001) and 
Lehmann et al. (2002), pointed out the necessity and difficulty of implementing social 
choice functions other than welfare maximization, due to other optimization goals or 
due to computational hardness. Characterizations of incentive compatible mechanisms 
have been previously obtained in economic theory as intermediate steps on the way to 
theorems with clear economic motivation. The discussion here tries to put it all together 
independently of particular intended applications. The weak monotonicity condition is 
from Bikhchandani et al. (2006) and the sufficiency of this condition in convex domains 
is from Saks and Yu (2005). The affine-maximization characterization in complete 
domains is from Roberts (1979), and Lavi et al. (2003) attempts generalization to other 
domains. The uniqueness of pricing is the analog of the revenue equivalence theorem in 
the Bayesian setting which is due to Myerson (1981); Green and Laffont (1977) showed 
it in the dominant strategy setting for welfare maximizing social choice functions. The 
corollary of the impossibility of budget-balanced bilateral trade appears in Myerson 
and Satterthwaite (1983) in the Bayesian setting. 

The Bayesian setting is currently the main vehicle of addressing lack of information 
in economic theory, and this development has mostly happened during the 1960s, 
with the main influence being the seminal work of Harsanyi (1968). As mentioned 
previously, most of the development of the field of Mechanism Design noted above was
in this setting. The revenue equivalence theorem, the form of the expected payment in 
single-parameter domains, as well as an analysis of revenue-maximizing auctions is 
from Myerson (1981). 

Risk-averse bidders in (reverse) auctions are analyzed by Holt (1980). Auctions 
in the common value model are analyzed in Wilson (1977) and Milgrom (1981). 
The general model of interdependent valuations with affiliated signals was studied in 
Milgrom and Weber (1982). Mechanism Design in complete information models is 
discussed in Maskin (1985) and Moore and Repullo (1988). 


CHAPTER 10


Mechanism Design without 
Money 


James Schummer and Rakesh V. Vohra 


Abstract 


Despite impossibility results on general domains, there are some classes of situations in which there 
exist interesting dominant-strategy mechanisms. While some of these situations (and the resulting 
mechanisms) involve the transfer of money, we examine some that do not. Specifically, we analyze 
problems where agents have single-peaked preferences over a one-dimensional “public” policy space; 
and problems where agents must match with each other. 


10.1 Introduction 


The Gibbard-Satterthwaite Theorem (Theorem 9.8) is a Procrustean bed¹ that is escaped
only by relaxing its assumptions. In conjunction with the Revelation Principle
(Proposition 9.25), it states that on the general domain of preferences, only dictatorial
rules can be implemented in dominant strategies (if the range contains at least three
alternatives). In this chapter we escape Procrustes by examining dominant strategy
implementation on restricted domains of preferences.²

In most applications it is clearly unreasonable to assume that agents’ preferences 
are completely unrestricted, as was assumed in the voting context of Section 9.2.4. 
For instance, in situations involving the allocation of goods, including money, one can 
safely assume that each agent prefers to receive more money (or other goods). As can 
be seen in the following chapters, the ability for agents to make monetary transfers 
allows for a rich class of strategy-proof rules. 

Nevertheless there are many important environments where money cannot be used as 
a medium of compensation. This constraint can arise from ethical and/or institutional 


1 Procrustes was a giant that lived by one of the roads that led to Attica. He boasted of a bed whose length exactly
matched the size of its occupant. What he neglected to mention was that this remarkable feature was obtained 
by either stretching or butchering his guest to fit the bed. 

2 Other avenues of escape not discussed here include randomization, making preferences common knowledge, 
and using weaker notions of implementation. 
considerations: many political decisions must be made without monetary transfers;
organ donations can be arranged by “trade” involving multiple needy patients and 
their relatives, yet monetary compensation is illegal. In this chapter we focus on a few 
examples of just this kind. 

Before proceeding with the examples, we formalize the idea that dominant- 
strategy implementation is a weaker concept on restricted domains of preferences. 
In general, a decision problem can be described by these parameters: a set of
agents $N = \{1, 2, \ldots, n\}$, a set of alternatives A, and for each agent $i \in N$ a set of potential
preference relations $R_i$ over the alternatives in A.³ The Gibbard-Satterthwaite
Theorem (Theorem 9.8) applies, for example, when each $R_i$ is the entire set of linear
orders on A.

An allocation rule is a function $f : \times_{i \in N} R_i \to A$, mapping preferences of the agents
into alternatives. It is strategy-proof if its use makes it a weakly dominant strategy
for agents to truthfully report their preferences (see Section 9.4). We observe the
following principle.

Consider two decision problems $(N, A, R_1, \ldots, R_n)$ and $(N, A, R'_1, \ldots, R'_n)$,
where $R'_i \subseteq R_i$ for each $i \in N$. Suppose $f : \times_{i \in N} R_i \to A$ is a strategy-proof rule for
the former problem. Then the restriction of the function f to $\times_{i \in N} R'_i$, namely $f|_{\times R'_i}$,
defines a strategy-proof rule for the latter problem.

The proof of this is straightforward: on a smaller domain of preferences, strategy- 
proofness is easier to satisfy because it imposes strictly fewer constraints. This simple 
observation justifies the search for reasonable (or at least nondictatorial) rules for 
decision problems involving “smaller” domains of preferences than those that yield the 
Gibbard-Satterthwaite Theorem. 

In Section 10.2 we analyze a problem involving a natural domain restriction when 
agents vote over one-dimensional policies. It is one of the canonical “public good” 
settings ($R_i = R_j$ for all $i, j \in N$) in which interesting, strategy-proof rules can
be obtained. The analysis here is illustrative of the approach used to characterize 
such rules in other environments. In Sections 10.3 and 10.4 we analyze matching 
problems. As opposed to the previous setting, these problems have the feature that 
each agent cares only about his own private consumption; that is, each $R_i$ contains
only preference relations that are sensitive only to certain dimensions of the
alternative space A; hence $R_i \neq R_j$ whenever $i \neq j$. These are examples of what
are called “private good” problems. Two kinds of matching problems are analyzed, 
demonstrating the limits of what can be implemented in dominant strategies in such 
environments. 


10.2 Single-Peaked Preferences over Policies


A simple but elegant class of domains involves single-peaked preferences over one-dimensional
policy spaces. This domain can be used to model political policies, economic
decisions, location problems, or any allocation problem where a single point

3 A preference relation is a weak order on A. 
must be chosen in an interval. The key assumption is that each agent’s preferences
have a single most-preferred point in the interval, and that preferences
are “decreasing” as one moves away from that peak.

Formally, the allocation space (or policy space) is the unit interval A = [0, 1]. An
outcome in this model is a single point $x \in A$. Each agent $i \in N$ has a preference
ordering $\succeq_i$ (i.e., a weak order) over the outcomes in [0, 1]. The preference relation
$\succeq_i$ is single-peaked if there exists a point $p_i \in A$ (the peak of $\succeq_i$) such that for all
$x \in A \setminus \{p_i\}$ and all $\lambda \in [0, 1)$, $(\lambda x + (1 - \lambda) p_i) \succ_i x$.⁴ Let R denote the class of
single-peaked preferences.

We denote the peaks of preference relations $\succeq_i$, $\succeq'_i$, $\succeq_j$, etc., respectively by $p_i$, $p'_i$,
$p_j$, etc. Denote a profile (n-tuple) of preferences as $\succeq \in R^n$.

One can imagine this model as representing a political decision such as an income 
tax rate, another political issue with conservative/liberal extremes, the location of a 
public facility on a road, or even something as simple as a group of people deciding 
on the temperature setting for a shared office. In these and many other examples, the 
agents have an ideal preferred policy in mind, and would prefer that a decision be made 
as close as possible to this “peak.” 

A rule $f : R^n \to A$ assigns an outcome $f(\succeq)$ to any preference profile $\succeq$. As before,
a rule is strategy-proof if it is a dominant strategy for each agent to report his preferences
truthfully when the rule is being used to choose a point.

In contrast to the impossibility result of Gibbard (1973) and Satterthwaite (1975),
which holds on the universal domain of preferences, we shall see that this class of
problems admits a rich family of strategy-proof rules whose ranges include more than 
two alternatives. In fact, the family of such rules remains rich even when one restricts 
attention (as we do in this chapter) to rules that satisfy the following condition. 

We say that a rule f is onto if for all $x \in A$ there exists $\succeq \in R^n$ such that $f(\succeq) = x$.
An onto rule cannot preclude an outcome from being chosen ex ante. It is not without
loss of generality to impose this condition. For instance, fix two points $x, y \in [0, 1]$ and
consider a rule that chooses whichever of the two points is preferred to the other by a
majority of agents (and where x is chosen in case of a tie). Such a rule is strategy-proof,
but not onto. Similar strategy-proof rules can even break ties between x and y by using 
preference information about other points x’, y’,..., in [0, 1], even though x’, etc., are 
not in the range of the rule. 

The onto condition is even weaker than what is called unanimity, which requires
that whenever all agents’ preferences have the same peak ($p_i = p_j$ for all i, j), the rule
must choose that location as the outcome. In turn, unanimity is weaker than Pareto-optimality:
for all $\succeq \in R^n$, there exists no point $x \in [0, 1]$ such that $x \succ_i f(\succeq)$ for all
$i \in N$.

As it turns out, these three requirements are all equivalent among strategy-proof
rules.


Lemma 10.1 Suppose f is strategy-proof. Then f is onto if and only if it is 
unanimous if and only if it is Pareto-optimal. 


4 The binary relation $\succ_i$ is the strict (asymmetric) part of $\succeq_i$. Under a single-peaked preference relation, preference
is strictly decreasing as one moves away from $p_i$.

PROOF It is clear that Pareto-optimality implies the other two conditions. Suppose
f is strategy-proof and onto. Fix $x \in [0, 1]$ and let $\succeq \in R^n$ be such that
$f(\succeq) = x$. Consider any “unanimous” profile $\succeq' \in R^n$ such that $p'_i = x$ for
each $i \in N$. By strategy-proofness, $f(\succeq'_1, \succeq_2, \ldots, \succeq_n) = x$, otherwise agent 1
could manipulate f. Repeating this argument, $f(\succeq'_1, \succeq'_2, \succeq_3, \ldots, \succeq_n) = x$, ...,
$f(\succeq') = x$. That is, f is unanimous.

Finally, to derive a contradiction, suppose that f is not Pareto-optimal at some
profile $\succeq \in R^n$. This implies that either (i) $f(\succeq) < p_i$ for all $i \in N$ or (ii) $f(\succeq) > p_i$
for all $i \in N$. Without loss of generality, assume (i) holds. Furthermore, assume
that the agents are labeled so that $p_1 \le p_2 \le \cdots \le p_n$.

If $p_1 = p_n$ then unanimity is violated, completing the proof. Otherwise, let
$j \in N$ be such that $p_1 = p_j < p_{j+1}$; that is, j < n agents have the minimum
peak. For all i > j, let $\succeq'_i$ be a preference relation such that both $p'_i = p_1$ and
$f(\succeq) \succ'_i p_i$.

Let $x_n = f(\succeq_1, \ldots, \succeq_{n-1}, \succeq'_n)$. By strategy-proofness, $x_n \in [f(\succeq), p_n]$, otherwise
agent n (with preference $\succeq'_n$) could manipulate f by reporting preference $\succeq_n$.
Similarly, $x_n \notin (f(\succeq), p_n]$, otherwise agent n (with preference $\succeq_n$) could manipulate
f by reporting preference $\succeq'_n$. Therefore $x_n = f(\succeq)$.

Repeating this argument as each i > j replaces $\succeq_i$ with $\succeq'_i$, we have


$$f(\succeq_1, \ldots, \succeq_j, \succeq'_{j+1}, \ldots, \succeq'_n) = f(\succeq),$$


which contradicts unanimity. Since a strategy-proof, onto rule must be unanimous,
this is a contradiction.


10.2.1 Rules 


The central strategy-proof rule on this domain is the simple median-voter rule. Suppose 
that the number of agents n is odd. Then the rule that picks the median of the agents’ 
peaks (the $p_i$’s) is a strategy-proof rule.

It is straightforward to see why this rule is strategy-proof: if an agent’s peak $p_i$ lies
below the median peak, then he can change the median only by reporting a preference 
relation whose peak lies above the true median. The effect of this misreport is for 
the rule to choose a point even further away from p;, making the agent worse off. A 
symmetric argument handles the case in which the peak is above the median. Finally, 
an agent cannot profitably misreport his preferences if his peak is the median one to 
begin with. 

More generally, for any number of agents n and any positive integer $k \le n$, the
rule that picks the kth highest peak is strategy-proof for precisely the same reasons as
above. An agent can only move the kth peak further from his own. The median happens 
to be the case where k = (n + 1)/2. 
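
The claim can be checked by brute force on a small example. The sketch below is illustrative only: it restricts reported peaks to a finite grid and assumes each agent prefers outcomes closer to his peak, which is just one particular (symmetric) kind of single-peaked preference. The function names are hypothetical.

```python
from itertools import product


def kth_highest_peak(peaks, k):
    """Rule that picks the k-th highest reported peak (k = 1 is the maximum)."""
    return sorted(peaks, reverse=True)[k - 1]


def can_manipulate(rule, grid, n):
    """Search all grid profiles for an agent who gains by misreporting his peak.

    Assumes distance-based preferences: an outcome is better when it is
    strictly closer to the agent's true peak.
    """
    for profile in product(grid, repeat=n):
        for i, true_peak in enumerate(profile):
            honest_distance = abs(rule(profile) - true_peak)
            for lie in grid:
                misreport = profile[:i] + (lie,) + profile[i + 1:]
                if abs(rule(misreport) - true_peak) < honest_distance - 1e-12:
                    return True
    return False


grid = [g / 4 for g in range(5)]  # peaks restricted to {0, 1/4, 1/2, 3/4, 1}
n = 3
for k in range(1, n + 1):
    rule = lambda peaks, k=k: kth_highest_peak(peaks, k)
    print(k, can_manipulate(rule, grid, n))
```

With n = 3, the case k = 2 is the median-voter rule. The search is expected to find no profitable misreport for any k, though a finite check of this kind is of course not a proof.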

The strategy-proofness of such rules stands in contrast to the incentives properties 
of rules that choose average-type statistics. Consider the rule that chooses the average 
of the n agents’ peaks. Any agent with peak p; € (0, 1) that is not equal to the average 
can manipulate the rule by reporting preferences with a more extreme peak (closer to 
0 or 1) than his true peak. 
This would also hold for any weighted average of the agents’ peaks, with one
exception. If a rule allocated all of the weight to one agent, then the resulting rule 
simply picks that agent’s peak always. Such a dictatorial rule is strategy-proof and 
onto. 
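
A concrete instance of such a manipulation is sketched below. The peaks and weights are made-up numbers, and the helper name is hypothetical.

```python
def weighted_average_rule(peaks, weights):
    """Rule that chooses the weighted average of the reported peaks."""
    return sum(w * p for w, p in zip(weights, peaks)) / sum(weights)


true_peaks = [0.2, 0.5, 0.9]
equal_weights = [1, 1, 1]

# Honest reports: the outcome lies above agent 0's peak of 0.2.
honest = weighted_average_rule(true_peaks, equal_weights)

# Agent 0 exaggerates toward 0, pulling the average closer to his true peak.
manipulated = weighted_average_rule([0.0, 0.5, 0.9], equal_weights)
print(abs(honest - 0.2), abs(manipulated - 0.2))

# With all the weight on agent 0 the rule is dictatorial: it returns his
# reported peak, so he has no reason to misreport.
print(weighted_average_rule(true_peaks, [1, 0, 0]))
```

The second printed distance is smaller than the first, so agent 0 gains by misreporting under equal weights.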

In addition to favorable incentives properties, rules based on order statistics have 
the property that they require little information to be computed. Technically a rule 
requires agents to report an entire preference ordering over [0, 1]. The rules we have 
discussed so far, however, only require agents to report their most preferred point, i.e., 
a single number. In fact, under the onto assumption, this informational property is a 
consequence of the strategy-proofness requirement; that is, all strategy-proof and onto
rules have the property that they can be computed solely from information about the
agents’ peaks.

To begin showing this, we first observe that the class of “kth-statistic rules” can
be further generalized as follows. Consider a fixed set of points $y_1, y_2, \ldots, y_{n-1} \in A$.
Consider the rule that, for any profile of preferences $\succeq$, chooses the median of the
2n − 1 points consisting of the n agents’ peaks and the n − 1 points of y. This kind of
rule differs from the previous ones in that, for some choices of y and some profiles of
preferences, the rule may choose a point that is not the peak of any agent’s preferences.
Yet, for the same reasons as above, such a rule is strategy-proof.
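
A minimal sketch of such a rule follows, with made-up peaks and fixed points. The name `generalized_median` is illustrative and not taken from the text.

```python
from statistics import median


def generalized_median(peaks, fixed_points):
    """Median of the n reported peaks together with n - 1 fixed points y_1, ..., y_{n-1}."""
    if len(fixed_points) != len(peaks) - 1:
        raise ValueError("need exactly n - 1 fixed points for n agents")
    # 2n - 1 values in total, so the median is one of the listed values.
    return median(list(peaks) + list(fixed_points))


peaks = [0.1, 0.2, 0.9]

# Fixed points inside the interval: the outcome need not be any agent's peak.
print(generalized_median(peaks, [0.5, 0.5]))

# Fixed points at the extremes recover order-statistic rules on the peaks.
print(generalized_median(peaks, [0.0, 1.0]))  # median peak
print(generalized_median(peaks, [1.0, 1.0]))  # highest peak
print(generalized_median(peaks, [0.0, 0.0]))  # lowest peak
```

Placing every fixed point at 0 or 1 thus reproduces the kth-statistic rules, while interior fixed points yield genuinely new rules.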

It turns out that such rules compose the entire class of strategy-proof and onto
rules that treat agents symmetrically. To formalize this latter requirement, we call a
rule anonymous if for any $\succeq \in R^n$ and any permutation $\succeq'$ of $\succeq$, $f(\succeq') = f(\succeq)$. This
requirement captures the idea that the agents’ names play no role in the behavior of 
a rule. Dictatorial rules mentioned above are examples of rules that are strategy-proof 
and onto, but not anonymous. 


Theorem 10.2 A rule f is strategy-proof, onto, and anonymous if and only if
there exist $y_1, y_2, \ldots, y_{n-1} \in [0, 1]$ such that for all $\succeq \in R^n$,


$$f(\succeq) = \mathrm{med}\{p_1, p_2, \ldots, p_n, y_1, y_2, \ldots, y_{n-1}\}. \tag{10.1}$$


PROOF We leave it as an exercise to verify that such a rule satisfies the three 
axioms in the Theorem. To prove the converse, suppose f is strategy-proof, onto, 
and anonymous. 

We make extensive use of the two (extreme) preference relations that have
peaks at 0 and 1 respectively. Since preference relations are ordinal, there is only
one preference relation with a peak at 0 and only one with a peak at 1. Denote
these two preference relations by $\succeq_i^0$ and $\succeq_i^1$ respectively.

(Construct the $y_m$’s.) For any $1 \le m \le n - 1$, let $y_m$ denote the outcome of f
when m agents have preference relation $\succeq_i^1$ and the remainder have $\succeq_i^0$:


$$y_m = f(\succeq_1^0, \ldots, \succeq_{n-m}^0, \succeq_{n-m+1}^1, \ldots, \succeq_n^1).$$


Recall that by anonymity the order of the arguments of f is irrelevant; if precisely
m agents have preference relation $\succeq_i^1$ and the rest have $\succeq_i^0$ then the outcome
is $y_m$. In addition, we leave it to the reader to verify that strategy-proofness
implies monotonicity of the $y_m$’s: $y_m \le y_{m+1}$ for each $1 \le m \le n - 2$. We prove
the theorem by showing that f satisfies Eq. (10.1) with respect to this list of $y_m$’s.
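
The construction of the $y_m$’s can be illustrated on a rule that is already of the form (10.1). The sketch below treats the rule as a black box that takes reported peaks, which suffices here because the two extreme preference relations are determined by their peaks. The hidden fixed points are made-up values.

```python
from statistics import median


def recover_fixed_points(rule, n):
    """y_m = outcome when m agents report peak 1 and the other n - m report peak 0."""
    return [rule([0.0] * (n - m) + [1.0] * m) for m in range(1, n)]


# Hypothetical rule for n = 4 agents with hidden fixed points y_1 <= y_2 <= y_3.
hidden = [0.3, 0.3, 0.8]
rule = lambda peaks: median(list(peaks) + hidden)

print(recover_fixed_points(rule, 4))
```

Evaluating the rule at the n − 1 extreme profiles returns the hidden list in increasing order, matching the monotonicity of the $y_m$’s noted above.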

Consider a profile of preferences $\succeq \in R^n$ with peaks $p_1, \ldots, p_n$. Without loss
of generality (by anonymity) assume that $p_i \le p_{i+1}$ for each $i \le n - 1$. We wish
to show $f(\succeq) = x^* = \mathrm{med}\{p_1, \ldots, p_n, y_1, \ldots, y_{n-1}\}$.

(Case 1: the median is some $y_m$.) Suppose $x^* = y_m$ for some m. By monotonicity
of the peaks and $y_m$’s, since $x^*$ is the median of 2n − 1 points this implies
$p_{n-m} \le x^* = y_m \le p_{n-m+1}$. By assumption,


$$x^* = y_m = f(\succeq_1^0, \ldots, \succeq_{n-m}^0, \succeq_{n-m+1}^1, \ldots, \succeq_n^1). \tag{10.2}$$


Let $x_1 = f(\succeq_1, \succeq_2^0, \ldots, \succeq_{n-m}^0, \succeq_{n-m+1}^1, \ldots, \succeq_n^1)$. Strategy-proofness implies
$x_1 \ge x^*$, otherwise agent 1 with preference $\succeq_1^0$ could manipulate f. Similarly, since
$p_1 \le y_m$, we cannot have $x_1 > x^*$, otherwise agent 1 with preference $\succeq_1$ could
manipulate f. Hence $x_1 = x^*$. Repeating this argument for all $i \le n - m$,
$x^* = f(\succeq_1, \ldots, \succeq_{n-m}, \succeq_{n-m+1}^1, \ldots, \succeq_n^1)$. The symmetric argument for all $i > n - m$
implies


$$f(\succeq_1, \ldots, \succeq_n) = x^*. \tag{10.3}$$


(Case 2: the median is an agent’s peak.) The remaining case is that $y_m < x^* < y_{m+1}$
for some m. (The cases where $x^* < y_1$ and $x^* > y_{n-1}$ are similar, denoting
$y_0 = 0$ and $y_n = 1$.) In this case, since the agents’ peaks are in increasing order,
we have $x^* = p_{n-m}$.

If 


0 0 1 1 
f ( yy bees Cy m po Zn-m> =p Ree eee el Ge ag ee (10.4) 


then, analogous to the way Eq. (10.2) implied Eq. (10.3), repeated applications 
of strategy-proofness (to the n — 1 agents other than i = n — m) would imply 
f(@1,..--; =n) = x*, and the proof would be finished. The remainder of the 
proof is devoted to showing that indeed Eq. (10.4) must hold. 

Suppose to the contrary that

\[ f(\succeq_1^0, \ldots, \succeq_{n-m-1}^0, \succeq_{n-m}, \succeq_{n-m+1}^1, \ldots, \succeq_n^1) = x' < x^*. \tag{10.5} \]

(The case $x' > x^*$ can be proven symmetrically.) If agent $(n-m)$ were to report
preference $\succeq_{n-m}^0$ instead, $f$ would choose outcome $y_m$; hence strategy-proofness
implies $y_m \le x' < x^*$. See Figure 10.1.

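Denote the outcomes that agent $(n-m)$ can obtain by varying his preferences,
fixing the others, as⁵

\[ O = \{ x : \exists\, \tilde{\succeq}_{n-m} \in \mathcal{R} \text{ s.t. } x = f(\succeq_1^0, \ldots, \succeq_{n-m-1}^0, \tilde{\succeq}_{n-m}, \succeq_{n-m+1}^1, \ldots, \succeq_n^1) \}. \]

By definition, $x' \in O$; Case 1 implies $y_m, y_{m+1} \in O$. Strategy-proofness implies
that $x' = \max\{x \in O : x \le x^*\}$, otherwise by reporting some other preference,
agent $(n-m)$ could obtain some $x \in (x', x^*)$, violating strategy-proofness.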


5 The literature on strategy proofness refers to this as an option set.

Figure 10.1. Proof of Theorem 10.2. If a strategy-proof, onto rule does not pick $x^*$ when it is
the median of peaks and $y_m$'s, then a contradiction is reached using preferences with peaks at
$p^L$ and $p^H$.


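Letting $x'' = \inf\{x \in O : x \ge x^*\}$, strategy-proofness implies $x'' \in O$.⁶ To
see this, let $\succeq''_{n-m}$ be a preference relation with peak $p''_{n-m} = x''$ and such
that $(x'' + \epsilon) \succ''_{n-m} x'$ for some small $\epsilon > 0$. Then strategy-proofness implies

\[ f(\succeq_1^0, \ldots, \succeq_{n-m-1}^0, \succeq''_{n-m}, \succeq_{n-m+1}^1, \ldots, \succeq_n^1) = \hat{x} \in [x'', x'' + \epsilon]. \]

But if $\hat{x} \ne x''$, then there would exist a misreport resulting in an outcome arbitrarily closer to
$x''$, making agent $(n-m)$ (with preference $\succeq''_{n-m}$) better off. Hence $\hat{x} = x'' =
\min\{x \in O : x \ge x^*\}$. With Eq. (10.5), we have $x'' > x^*$.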

We have shown that $O \cap (x', x'') = \emptyset$. Let $\succeq_i^L$ be a symmetric preference
relation with peak at $p^L = (x' + x'')/2 - \epsilon$, where $\epsilon > 0$ is sufficiently small;
see Figure 10.1. Similarly let $\succeq_i^H$ be a symmetric preference relation with peak at
$p^H = (x' + x'')/2 + \epsilon$. Then strategy-proofness implies

\[ f(\succeq_1^0, \ldots, \succeq_{n-m-1}^0, \succeq_{n-m}^H, \succeq_{n-m+1}^1, \ldots, \succeq_n^1) = x''. \]

By repeated application of strategy-proofness (along the lines used in proving
Eq. (10.3)), this implies

\[ f(\succeq_1^L, \ldots, \succeq_{n-m-1}^L, \succeq_{n-m}^H, \succeq_{n-m+1}^1, \ldots, \succeq_n^1) = x''. \]


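Lemma 10.1 (Pareto-optimality) implies

\[ f(\succeq_1^L, \ldots, \succeq_{n-m-1}^L, \succeq_{n-m}^L, \succeq_{n-m+1}^1, \ldots, \succeq_n^1) \ge p^L. \]

Therefore, strategy-proofness implies

\[ f(\succeq_1^L, \ldots, \succeq_{n-m-1}^L, \succeq_{n-m}^L, \succeq_{n-m+1}^1, \ldots, \succeq_n^1) = x'' \tag{10.6} \]

otherwise agent $n-m$ could manipulate at one of the two profiles (since $\epsilon$ is
small).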
On the other hand, strategy-proofness implies

\[ f(\succeq_1^0, \ldots, \succeq_{n-m-1}^0, \succeq_{n-m}^L, \succeq_{n-m+1}^1, \ldots, \succeq_n^1) = x' \]

by the definition of $\succeq^L$. Strategy-proofness implies that if agent $(n-m-1)$
instead reports preference $\succeq^L$, a point must be chosen that is in the interval
$[x', x'' - 2\epsilon]$, otherwise, he could report $\succeq^0$ to gain. By repeated application of
this argument, this continues to hold as each agent $1 \le i \le n-m-1$ changes
his report from $\succeq_i^0$ to $\succeq_i^L$, so

\[ f(\succeq_1^L, \ldots, \succeq_{n-m-1}^L, \succeq_{n-m}^L, \succeq_{n-m+1}^1, \ldots, \succeq_n^1) \in [x', x'' - 2\epsilon]. \]

6 More generally, strategy-proofness alone implies $O$ is closed. For brevity we prove only $x'' \in O$.

This contradicts Eq. (10.6). Hence Eq. (10.5) cannot hold, so $x' \ge x^*$; the
symmetric argument implies $x' = x^*$, resulting in Eq. (10.4). Thus $f$ chooses the
median of these $2n-1$ points for profile $\succeq$.


The parameters ($y_m$'s) in Theorem 10.2 can be thought of as the rule's degree of
compromise when agents have extremist preferences. If $m$ agents prefer the highest
possible outcome (1), while $n-m$ prefer the lowest (0), then which point should
be chosen? A true median rule would pick whichever extreme (0 or 1) contains
the most peaks. On the other hand, the other rules described in the Theorem may
choose intermediate points ($y_m$) as a compromise. The degree of compromise (which
$y_m$) can depend on the degree to which the agents' opinions are divided (the size
of $m$).
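The rule of Theorem 10.2 can be sketched in a few lines. The example below is only an illustration: the parameter values in `Y_PARAMS`, the report grid, and the restriction to symmetric preferences (distance from the peak) are assumptions made here, not part of the theorem. It computes the median of the $n$ peaks and the $n-1$ parameters, and then searches the grid for a profitable misreport.

```python
from itertools import product
from statistics import median


def phantom_median_rule(peaks, phantoms):
    """Return med{p_1, ..., p_n, y_1, ..., y_{n-1}} for n peaks and n-1 fixed points y_m."""
    if len(phantoms) != len(peaks) - 1:
        raise ValueError("need exactly n-1 parameters y_m for n peaks")
    # 2n-1 points in total, so the median is always one of the listed points.
    return median(list(peaks) + list(phantoms))


def can_gain(peaks, phantoms, i, reports):
    """True if agent i, with symmetric preferences around peaks[i], gains by some misreport."""
    truthful = phantom_median_rule(peaks, phantoms)
    for r in reports:
        lie = list(peaks)
        lie[i] = r
        if abs(phantom_median_rule(lie, phantoms) - peaks[i]) < abs(truthful - peaks[i]):
            return True
    return False


Y_PARAMS = [0.3, 0.6]                   # hypothetical y_1 <= y_2 for n = 3
GRID = [k / 10 for k in range(11)]      # reports restricted to 0, 0.1, ..., 1

one_extremist_high = phantom_median_rule([0.0, 0.0, 1.0], Y_PARAMS)    # m = 1
two_extremists_high = phantom_median_rule([0.0, 1.0, 1.0], Y_PARAMS)   # m = 2
manipulable = any(
    can_gain(list(profile), Y_PARAMS, i, GRID)
    for profile in product(GRID, repeat=3)
    for i in range(3)
)
```

With one agent at 1 the rule returns $y_1$, with two agents at 1 it returns $y_2$, and the grid search finds no profitable misreport. Setting the parameters to 0 and 1 would recover a true median rule for three agents.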

The anonymity requirement is a natural one in situations where agents are to be 
treated as equals. If one does not require this, however, the class of strategy-proof 
rules becomes even larger. We have already mentioned dictatorial rules, which always
choose a predetermined agent's peak. There are less extreme violations of anonymity:
The full class of strategy-proof, onto rules, which we now define, allows agents to be 
treated with varying degrees of asymmetry. 


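Definition 10.3. A rule $f$ is a generalized median voter scheme (g.m.v.s.) if
there exist $2^n$ points in $[0, 1]$, $\{\alpha_S\}_{S \subseteq N}$, such that

(i) $S \subseteq T \subseteq N$ implies $\alpha_S \le \alpha_T$,
(ii) $\alpha_\emptyset = 0$, $\alpha_N = 1$, and
(iii) for all $\succeq \in \mathcal{R}^n$, $f(\succeq) = \max_{S \subseteq N} \min\{\alpha_S, p_i : i \in S\}$.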


An example is given below. It is worth making two observations regarding Definition 10.3.
First, the monotonicity condition (i) is actually redundant. If parameters
$\{\alpha_S\}_{S \subseteq N}$ fail this condition, they still define some strategy-proof rule via condition (iii).
However, the resulting rule could also be defined by an alternate set of parameters
$\{\alpha'_S\}_{S \subseteq N}$ that do satisfy condition (i). Second, condition (ii) is present merely to guarantee
the rule to be onto. Parameters that fail this condition still define a strategy-proof
rule whose range is $[\alpha_\emptyset, \alpha_N]$.⁷

Consider the rule described by the parameters ($\alpha_S$'s) in Figure 10.2, for the 3-agent
case. The reader should first verify the following. If each agent in some set $S \subseteq N$
were to have a preference peak at 1, while each remaining agent (in $N \setminus S$) were to have
a preference peak at 0, then the rule would choose $\alpha_S$ as the outcome. In this sense, the
$\alpha_S$ parameters reflect a (nonanonymous) degree of compromise at extreme preference
profiles, analogous to the $y_m$ parameters of Theorem 10.2.

Without the anonymity condition, some agents (more generally, some coalitions of
agents) are more powerful than others. To see this, consider the profile of preferences
represented in Figure 10.2 with peaks $p_1, p_2, p_3$. Following condition (iii) of Definition 10.3,
calculate $\min\{\alpha_S, p_i : i \in S\}$ for each $S \subseteq N$. Beginning with the three


7 To avoid potential confusion, we point out that, in some of the literature, the term generalized median voter
scheme also refers to such rules.

Figure 10.2. An example of a generalized median voter scheme for $n = 3$.


singleton coalitions of the form $S = \{i\}$, these values are $\alpha_{\{1\}}$, $\alpha_{\{2\}}$, and $\alpha_{\{3\}}$, because each
$p_i$ is above that agent's corresponding $\alpha_{\{i\}}$. (For peak $p_3'$ the third value would have
been $p_3'$ instead.) Since the g.m.v.s. eventually chooses the maximum of these kinds of
values (after we also check larger coalitions), agent 3 can be said to have more power
than the other two agents, as a singleton. A large $\alpha_{\{3\}}$ corresponds to more instances
in which agent 3's peak is a candidate outcome for this rule. A small $\alpha_{\{1\}}$ corresponds
to more instances in which agent 1 has no impact on the outcome (i.e., whenever
$p_1 > \alpha_{\{1\}}$).

On the other hand, we also need to calculate these minimum values for larger
coalitions. For the pairs of agents $\{1, 2\}$, $\{1, 3\}$, and $\{2, 3\}$, these values are $\alpha_{\{1,2\}}$, $p_1$,
and $p_2$ respectively. Coalition $\{1, 2\}$ is the weakest two-agent coalition in the sense that
it has the lowest $\alpha_S$. After checking $S = \emptyset$ (which yields 0) and $S = N$ (yielding a
repetition of the value $p_2$), we calculate the rule's outcome to be the maximum of the
$2^n$ values $\{0, \alpha_{\{1\}}, \alpha_{\{2\}}, \alpha_{\{3\}}, \alpha_{\{1,2\}}, p_1, p_2, p_2\}$ we have obtained, which is $\alpha_{\{3\}}$.
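Condition (iii) of Definition 10.3 can be evaluated directly by enumerating coalitions. The sketch below does so for three agents. The parameter values in `ALPHA` and the peaks are hypothetical, chosen only to reproduce the pattern of the worked example; they are not the values drawn in Figure 10.2.

```python
from itertools import combinations


def gmvs(peaks, alpha):
    """Evaluate max over S of min{alpha_S, p_i : i in S}; agents are numbered 1..n."""
    agents = range(1, len(peaks) + 1)
    values = []
    for size in range(len(peaks) + 1):
        for S in combinations(agents, size):
            values.append(min([alpha[frozenset(S)]] + [peaks[i - 1] for i in S]))
    return max(values)


# Hypothetical monotone parameters for n = 3 (conditions (i) and (ii) hold).
ALPHA = {
    frozenset(): 0.0,
    frozenset({1}): 0.1, frozenset({2}): 0.2, frozenset({3}): 0.5,
    frozenset({1, 2}): 0.3, frozenset({1, 3}): 0.8, frozenset({2, 3}): 0.9,
    frozenset({1, 2, 3}): 1.0,
}

outcome = gmvs([0.45, 0.4, 0.7], ALPHA)   # the singleton {3} supplies the maximum
extreme = gmvs([1.0, 0.0, 1.0], ALPHA)    # agents 1 and 3 at 1, agent 2 at 0
```

For these numbers the eight minimum values are 0, 0.1, 0.2, 0.5, 0.3, 0.45, 0.4, and 0.4, so the outcome is the parameter of the singleton coalition of agent 3. At the extreme profile the rule returns the parameter of the coalition {1, 3}, as the text asks the reader to verify.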

We close by stating the main result of this section. We omit its proof, which has 
much in common with the proof of Theorem 10.2. 


Theorem 10.4 A rule $f$ is strategy-proof and onto if and only if it is a generalized
median voter scheme.


10.2.2 Application to Public Good Cost Sharing 


Consider a group of $n$ agents who have access to a machine that can convert their labor
into some public good. Specifically, suppose that the machine requires the simultaneous
labor of all $n$ agents in order to work. The agents are free to jointly decide how many
hours of labor, $\ell$, to work. Implicit is the requirement that each agent work for $\ell$ hours,
however, since the machine requires all $n$ agents' labor simultaneously. After $\ell$ hours of
labor, the machine outputs $y = Y(\ell)$ units of some public good, where the production
function $Y$ is assumed to be an increasing and strictly concave function, with $Y(0) = 0$.

Different agents may have different preferences over how much labor they should
provide, in exchange for the public good. Let us suppose that we know nothing about
their preferences, other than the fact that they are represented by some utility function
$u_i(\ell, y)$ which is strictly increasing in $y$, strictly decreasing in $\ell$, and is quasi-concave.⁸
See Figure 10.3.

In this environment, a rule takes as input the reported utility functions of the agents,
subject only to the assumptions we have made. It then gives as output a single labor
requirement $\ell = f(u_1, \ldots, u_n)$. Each agent is then required to provide $\ell$ units of labor,


8 The function $u(\cdot)$ is quasi-concave if, at each $(\ell, y)$, the upper contour set $\{(\ell', y') : u(\ell', y') \ge u(\ell, y)\}$ is convex.

Figure 10.3. An agent with utility function $u$ most prefers the outcome $(y, \ell)$; one with $u'$
prefers $(y', \ell')$.


and they enjoy $Y(\ell)$ units of output as a reward. What rules are strategy-proof and
onto?

By assumption, outcomes may only be attained along the graph of Y. Because of 
the assumptions on Y and on preferences, it is clear that agents have single-peaked 
preferences over this consumption space. It follows that any strategy-proof, onto rule
for this environment is a generalized median voter scheme operating along the graph
of $Y$.

Proving this is not difficult, but involves some technical details that we omit. First 
the outcome space is not bounded as we assumed before, although it would certainly be 
reasonable to bound it by assumption. Second, the preference domain here should be 
verified to yield all the single-peaked preferences necessary to characterize generalized 
median voter schemes; e.g., we used symmetric single-peaked preferences to construct 
the proof of Theorem 10.2. Third, one should demonstrate that a strategy-proof rule in 
this environment is invariant to utility information away from the graph of Y. We leave 
it to the interested reader to verify our claim despite these technicalities. 

In this kind of problem, it may be reasonable to add additional requirements to
a rule. One that we address is the requirement that an agent should be better off as
part of this decision-making group than if he were simply to walk away. Formally, if
this public good technology did not exist, each agent would provide no labor ($\ell = 0$),
and would enjoy none of the public good ($y = 0$). We say a rule is individually
rational if for all $U = (u_1, \ldots, u_n)$ and $1 \le i \le n$, we have $u_i(f(U), Y(f(U))) \ge
u_i(0, 0)$.

What strategy-proof and onto rules satisfy individual rationality? In terms of our 
earlier model, where agents have single-peaked preferences on [0, 1], that question 
translates as follows: What g.m.v.s. has the property that, for any preference profile, 
each agent (weakly) prefers the chosen outcome to the outcome x = 0? 

The answer is that there is a unique such rule. As an exercise, we leave it to the
reader to show that the rule that chooses the minimum peak is the unique strategy-proof,
onto rule that satisfies this individual rationality condition. In terms of this public good
model, this corresponds to asking each agent their most preferred labor level $\ell$, and
choosing the minimum.
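A small numerical illustration of the minimum-peak rule follows. Everything specific in it is an assumption made for the example: the production function $Y(\ell) = \sqrt{\ell}$, the utilities $u_i(\ell, y) = y - c_i \ell$, and the cost values. With these choices each agent's preferred labor level along the graph of $Y$ is $1/(4c_i^2)$.

```python
import math


def Y(labor):
    """Hypothetical production function: increasing, strictly concave, Y(0) = 0."""
    return math.sqrt(labor)


def utility(cost, labor):
    """Hypothetical u_i(l, y) = y - c_i * l, evaluated along the graph of Y."""
    return Y(labor) - cost * labor


def peak(cost):
    """Maximizer of sqrt(l) - c*l, from the first-order condition 1 / (2 sqrt(l)) = c."""
    return 1.0 / (4.0 * cost * cost)


COSTS = [0.5, 1.0, 2.0]
peaks = [peak(c) for c in COSTS]
chosen = min(peaks)                       # the minimum-peak rule

# Individual rationality: nobody is worse off than at (l, y) = (0, 0).
rational_at_min = all(utility(c, chosen) >= utility(c, 0.0) for c in COSTS)
rational_at_max = all(utility(c, max(peaks)) >= utility(c, 0.0) for c in COSTS)
```

Choosing the minimum peak leaves every agent at least as well off as walking away, while choosing the maximum peak makes the agent with the highest cost strictly worse off than at zero labor.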
10.3 House Allocation Problem


The house allocation problem is a model for understanding the allocation of indivisible
goods. It involves a set N of n agents, each owning a unique house and a strict preference 
ordering over all n houses. The objective is to reallocate the houses among the agents 
in an appropriate way. A modern version of the same would replace houses by kidneys. 

While any possible (strict) preference ordering over the homes is permitted, the set of 
preferences over allocations is restricted. In particular, an agent is indifferent between 
all allocations that give her the same house. Therefore the Gibbard-Satterthwaite
Theorem does not apply in this setting.

One could select an allocation of homes in a variety of ways, perhaps so as to optimize
some function of the preferences and then investigate if the resulting allocation rule 
is strategy-proof. However, this ignores an important feature not present in earlier 
examples. In this environment, agents control the resources to be allocated. Therefore 
an allocation can be subverted by a subset of agents who might choose to break away 
and trade among themselves. For this reason it is natural to focus on allocations that 
are invulnerable to agents opting out. 

Number each house by the number of the agent who owns that house. An allocation
is an $n$-vector $a$ whose $i$th component, $a_i$, is the number of the house assigned to agent
$i$. If $a$ is the initial allocation then $a_i = i$. For an allocation to be feasible, we require
that $a_i \ne a_j$ for all $i \ne j$. The preference ordering of an agent $i$ will be denoted $\succ_i$
and $x \succ_i y$ will mean that agent $i$ ranks house $x$ above house $y$. Denote by $A$ the set
of all feasible allocations. For every $S \subseteq N$ let $A(S) = \{z \in A : z_i \in S \ \forall i \in S\}$ denote
the set of allocations that can be achieved by the agents in $S$ trading among themselves
alone. Given an allocation $a \in A$, a set $S$ of agents is called a blocking coalition (for
$a$) if there exists a $z \in A(S)$ such that for all $i \in S$ either $z_i \succ_i a_i$ or $z_i = a_i$, and for
at least one $j \in S$ we have that $z_j \succ_j a_j$. A blocking coalition can, by trading among
themselves, receive homes that each member strictly prefers (or finds equivalent) to the home she
receives under $a$, with at least one agent being strictly better off. The set of allocations
that are not blocked by any subset of agents is called the core.
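The definition of a blocking coalition can be checked by brute force on small instances. The sketch below is one way to do this, assuming strict preferences; the four-agent preference lists in `PREFS` are hypothetical and the function name is chosen here for illustration. It tries every coalition $S$ and every way the members of $S$ can reassign their own houses.

```python
from itertools import combinations, permutations


def find_blocking_coalition(alloc, prefs):
    """Return (S, z) if coalition S can block `alloc` using only its own houses, else None.

    alloc[i] is the house held by agent i, house j is initially owned by agent j,
    and prefs[i] lists houses from best to worst (strict preferences).
    """
    rank = {i: {h: r for r, h in enumerate(p)} for i, p in prefs.items()}
    agents = sorted(prefs)
    for size in range(1, len(agents) + 1):
        for S in combinations(agents, size):
            for houses in permutations(S):        # every reassignment of S's own houses
                z = dict(zip(S, houses))
                weakly = all(rank[i][z[i]] <= rank[i][alloc[i]] for i in S)
                strictly = any(rank[i][z[i]] < rank[i][alloc[i]] for i in S)
                if weakly and strictly:
                    return S, z
    return None


# Hypothetical preferences over houses 1..4.
PREFS = {1: [2, 1, 3, 4], 2: [1, 2, 3, 4], 3: [1, 4, 3, 2], 4: [3, 4, 1, 2]}

initial = {1: 1, 2: 2, 3: 3, 4: 4}
blocked_by = find_blocking_coalition(initial, PREFS)          # agents 1 and 2 swap
core_check = find_blocking_coalition({1: 2, 2: 1, 3: 4, 4: 3}, PREFS)
```

The search is exponential in the number of agents, so it is only meant for checking small examples by hand.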

The reader will be introduced to the notion of the core in Chapter 15 (Section 15.2) 
where it will be defined for a cooperative game in which utility is transferable via 
money (a TU game). The house allocation problem we consider is an example of a 
cooperative game with nontransferable utility (an NTU game). The definition of the 
core offered here is the natural modification of the notion of TU core to the present 
setting. 

The theorem below shows the core to be nonempty. The proof is by construction 
using the top trading cycle algorithm (TTCA). 


Definition 10.5 (Top Trading Cycle Algorithm) Construct a directed graph
using one vertex for each agent. If house $j$ is agent $i$'s $k$th ranked choice, insert
a directed edge from $i$ to $j$ and color the edge with color $k$. An edge of
the form $(i, i)$ will be called a loop. First, identify all directed cycles and loops
consisting only of edges colored 1. The strict preference ordering implies that the
set of such cycles and loops is node disjoint. Let $N_1$ be the set of vertices (agents)
incident to these cycles. Each cycle implies a sequence of swaps. For example,
suppose $i_1 \to i_2 \to i_3 \to \cdots \to i_r$ is one such cycle. Give house $i_1$ to agent $i_r$,
house $i_r$ to agent $i_{r-1}$, and so on. After all such swaps are performed, delete all
edges colored 1. Repeat with the edges colored 2 and call the corresponding set
of vertices incident to these edges $N_2$, and so on. The TTCA yields the resulting
matching.
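A minimal sketch of the algorithm follows. It uses the common formulation in which every remaining agent points to the owner of his favorite remaining house and one cycle is cleared at a time, which is an assumption about how to read the colored-edge description above rather than a transcription of it. The function name and the preference lists are hypothetical.

```python
def top_trading_cycles(prefs):
    """Return {agent: house}; house j starts with agent j, prefs[i] is best-first."""
    remaining = set(prefs)
    allocation = {}
    while remaining:
        # Each remaining agent points to his favorite house among those still present.
        points_to = {i: next(h for h in prefs[i] if h in remaining) for i in remaining}
        # Follow pointers from any agent until one repeats: that closes a cycle (or loop).
        path, cur = [], next(iter(remaining))
        while cur not in path:
            path.append(cur)
            cur = points_to[cur]
        cycle = path[path.index(cur):]
        for i in cycle:
            allocation[i] = points_to[i]      # every agent on the cycle gets the house he points to
        remaining -= set(cycle)
    return allocation


# Hypothetical preferences over houses 1..4.
PREFS = {1: [2, 1, 3, 4], 2: [1, 2, 3, 4], 3: [1, 4, 3, 2], 4: [3, 4, 1, 2]}
result = top_trading_cycles(PREFS)
```

In this instance agents 1 and 2 form the first cycle and swap houses. Once they leave, agent 3's favorite remaining house is house 4, so agents 3 and 4 form the second cycle.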


This algorithm is used to prove the following result. 


Theorem 10.6 The core of the house allocation problem consists of exactly one 
matching. 


PROOF We prove that if a matching is in the core, it must be the one returned 
by the TTCA. 

Under the TTCA, each agent in $N_1$ receives his favorite house, i.e., the house
ranked first in his preference ordering. Therefore, $N_1$ would form a blocking
coalition to any allocation that does not assign to all of those agents the houses
they would receive under the TTCA. That is, any core allocation must assign $N_1$
to houses just as the TTCA assigns them.

Given this fact, the same argument applies to $N_2$: Under the TTCA, each agent
in $N_2$ receives his favorite house not including those houses originally owned
by agents in $N_1$. Therefore, if an allocation is in the core and the agents in $N_1$
are assigned each other's houses, then agents in $N_2$ must receive the same houses
they receive under the TTCA.

Continuing the argument for each $N_i$ proves that if an allocation is in the core,
then it is the one determined by the TTCA. This proves that there is at most one
core allocation.

To prove that the TTCA allocation is in the core, it remains to be shown that
there is no other blocking coalition $S \subseteq N$. This is left to the reader.


To apply the TTCA, one must know the preferences of agents over homes. Do 
they have an incentive to truthfully report these? To give a strongly positive answer 
to this question, we first associate the TTCA with its corresponding direct revelation 
mechanism. Define the Top Trading Cycle (TTC) Mechanism to be the function 
(mechanism) that, for each profile of preferences, returns the allocation computed by 
the TTCA. 


Theorem 10.7 The TTC mechanism is strategy-proof. 


PROOF Let $\pi$ be a profile of preference orderings and $a$ the allocation returned
by TTCA when applied to $\pi$. Suppose that agent $j \in N_k$ for some $k$ misreports
her preference ordering. Denote by $\pi'$ the new profile of preference orderings.
Let $a'$ be the allocation returned by TTCA when applied to $\pi'$. If the TTCA is
not strategy-proof, $a'_j \succ_j a_j$. Observe that $a_i = a'_i$ for all $i \in \bigcup_{r=1}^{k-1} N_r$. Therefore,
$a'_j \in N \setminus \{\bigcup_{r=1}^{k-1} N_r\}$. However, the TTCA chooses $a_j$ to be agent $j$'s top ranked
choice from $N \setminus \{\bigcup_{r=1}^{k-1} N_r\}$, contradicting the fact that $a'_j \succ_j a_j$.
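Theorem 10.7 can be checked exhaustively on a tiny instance. The sketch below is only a sanity check under stated assumptions, not a proof: it uses three agents, a compact version of the common "point to your favorite remaining house" formulation of the algorithm, and hypothetical function names. It enumerates every profile of strict preferences and every possible misreport by every agent.

```python
from itertools import permutations, product


def ttc(prefs):
    """Top trading cycles; house j starts with agent j, prefs[i] is best-first."""
    remaining, alloc = set(prefs), {}
    while remaining:
        top = {i: next(h for h in prefs[i] if h in remaining) for i in remaining}
        path, cur = [], next(iter(remaining))
        while cur not in path:
            path.append(cur)
            cur = top[cur]
        cycle = path[path.index(cur):]
        alloc.update({i: top[i] for i in cycle})
        remaining -= set(cycle)
    return alloc


def profitable_misreports(agents):
    """List every (true profile, agent, lie) where lying yields a strictly better house."""
    orderings = list(permutations(agents))
    found = []
    for profile in product(orderings, repeat=len(agents)):
        truth = dict(zip(agents, profile))
        honest = ttc(truth)
        for j in agents:
            for lie in orderings:
                outcome = ttc({**truth, j: lie})
                # Compare using agent j's TRUE ranking.
                if truth[j].index(outcome[j]) < truth[j].index(honest[j]):
                    found.append((truth, j, lie))
    return found


violations = profitable_misreports([1, 2, 3])
```

For three agents this covers 216 profiles and six reports per agent, and the list of violations comes back empty, in line with the theorem.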


If we relax the requirement that preferences be strict, what we had previously called
a blocking set is now called a weakly blocking set. What we had previously called the
core is now called the strict core. With indifference, a blocking set $S$ is one where all
agents in $S$ are strictly better off by trading among themselves. Note the requirement
that all agents be strictly better off. The core is the set of allocations not blocked by
any set $S$.

When preferences are strict, every minimal weakly blocking set is a blocking set. To 
see this, fix a weakly blocking set S. An agent in S who is not made strictly better off 
by trade among agents in S must have been assigned their own home. Remove them 
from S. Repeat. The remaining agents must all be allocated houses that make them 
strictly better off. Hence, when preferences are strict the core and strict core coincide. 
With indifference permitted, the strict core can be different from the core. In fact, there 
are examples where the strict core is empty and others where it is not unique. Deciding
whether the strict core is empty can be done in time polynomial in $|N|$.

Another possible extension of the model is to endow the agents with more than
one good, for example, a home and a car. Clearly, if preferences over pairs of goods
are sufficiently rich, the core can be empty. It turns out that even under very severe
restrictions the core can still be empty, for example, when preferences are separable,
i.e., one's ranking over homes does not depend on which car one has.


10.4 Stable Matchings 


The stable matching problem was introduced as a model of how to assign students to 
colleges. Since its introduction, it has been the object of intensive study by both computer
scientists and economists. In computer science it is used as a vehicle for illustrating
basic ideas in the analysis of algorithms. In economics it is used as a stylized model
of labor markets. It has a direct real-world counterpart in the procedure for matching 
medical students to residencies in the United States. 

The simplest version of the problem involves a set $M$ of men and a set $W$ of women.
Each $m \in M$ has a strict preference ordering over the elements of $W$ and each $w \in W$
has a strict preference ordering over the men. As before the preference ordering of
agent $i$ will be denoted $\succ_i$ and $x \succ_i y$ will mean that agent $i$ ranks $x$ above $y$. A
matching is an assignment of men to women such that each man is assigned to at most
one woman and vice versa. We can accommodate the possibility of an agent choosing
to remain single as well. This is done by including for each man (woman) a dummy
woman (man) in the set $W$ ($M$) that corresponds to being single (or matched with
oneself). With this construction we can always assume that $|M| = |W|$.

As in the house allocation problem a group of agents can subvert a prescribed 
matching by opting out. In a manner analogous to the house allocation problem, we 
can define a blocking set. A matching is called unstable if there are two men m, m’ 
and two women w, w’ such that 


(i) $m$ is matched to $w$,
(ii) $m'$ is matched to $w'$, and
(iii) $w' \succ_m w$ and $m \succ_{w'} m'$.


The pair $(m, w')$ is called a blocking pair. A matching that has no blocking pairs is
called stable.

Example 10.8 The preference orderings for the men and women are shown in
the table below (most preferred first).

\[
\begin{array}{cccccc}
\succ_{m_1} & \succ_{m_2} & \succ_{m_3} & \succ_{w_1} & \succ_{w_2} & \succ_{w_3} \\
\hline
w_2 & w_1 & w_1 & m_1 & m_3 & m_1 \\
w_1 & w_3 & w_2 & m_3 & m_1 & m_3 \\
w_3 & w_2 & w_3 & m_2 & m_2 & m_2
\end{array}
\]

Consider the matching $\{(m_1, w_1), (m_2, w_2), (m_3, w_3)\}$. This is an unstable matching
since $(m_1, w_2)$ is a blocking pair. The matching $\{(m_1, w_1), (m_3, w_2), (m_2, w_3)\}$,
however, is stable.
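The two claims in Example 10.8 can be verified mechanically. The sketch below encodes the preference table of the example as Python dictionaries and lists every blocking pair of a given matching. The function and variable names are chosen here for illustration.

```python
# Preference lists of Example 10.8, most preferred first.
MEN = {"m1": ["w2", "w1", "w3"], "m2": ["w1", "w3", "w2"], "m3": ["w1", "w2", "w3"]}
WOMEN = {"w1": ["m1", "m3", "m2"], "w2": ["m3", "m1", "m2"], "w3": ["m1", "m3", "m2"]}


def blocking_pairs(matching, men, women):
    """matching maps each man to his partner; return all pairs (m, w) that block it."""
    husband = {w: m for m, w in matching.items()}
    pairs = []
    for m, ranking in men.items():
        for w in ranking:
            if w == matching[m]:
                break                     # the remaining women rank below m's partner
            if women[w].index(m) < women[w].index(husband[w]):
                pairs.append((m, w))      # w also prefers m to her current partner
    return pairs


unstable = blocking_pairs({"m1": "w1", "m2": "w2", "m3": "w3"}, MEN, WOMEN)
stable = blocking_pairs({"m1": "w1", "m3": "w2", "m2": "w3"}, MEN, WOMEN)
```

The first matching is blocked by $(m_1, w_2)$, as stated in the example, and also by $(m_3, w_2)$. The second matching has no blocking pair.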


Given the preferences of the men and women, is it always possible to find a sta- 
ble matching? Remarkably, yes, using what is now called the deferred acceptance 
algorithm. We describe the male-proposal version of the algorithm. 


Definition 10.9 (Deferred Acceptance Algorithm, male-proposals) First, each 
man proposes to his top-ranked choice. Next, each woman who has received at 
least two proposals keeps (tentatively) her top-ranked proposal and rejects the rest. 
Then, each man who has been rejected proposes to his top-ranked choice among 
the women who have not rejected him. Again each woman who has at least two 
proposals (including ones from previous rounds) keeps her top-ranked proposal 
and rejects the rest. The process repeats until no man has a woman to propose to 
or each woman has at most one proposal. At this point the algorithm terminates 
and each man is assigned to a woman who has not rejected his proposal. Notice 
that no man is assigned to more than one woman. Since each woman is allowed 
to keep only one proposal at any stage, no woman is assigned to more than one 
man. Therefore the algorithm terminates in a matching. 


We illustrate how the (male-proposal) algorithm operates using Example 10.8 above.
In the first round, $m_1$ proposes to $w_2$, $m_2$ to $w_1$, and $m_3$ to $w_1$. At the end of this round
$w_1$ is the only woman to have received two proposals, one from $m_3$ and the other from
$m_2$. Since she ranks $m_3$ above $m_2$, she keeps $m_3$ and rejects $m_2$. Since $m_2$ is the only
man to have been rejected, he is the only one to propose again in the second round. This
time he proposes to $w_3$. Now each woman has only one proposal and the algorithm
terminates with the matching $\{(m_1, w_2), (m_2, w_3), (m_3, w_1)\}$. It is easy to verify that
the matching is stable and that it is different from the one presented earlier.
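The run just described can be reproduced with a short implementation. The sketch below is a sequential variant, in which one free proposer acts at a time instead of all rejected proposers acting in rounds; this is an assumption of the sketch, relying on the known fact that the order of proposals does not change the final matching. The function name is hypothetical and the preference lists are those of Example 10.8.

```python
MEN = {"m1": ["w2", "w1", "w3"], "m2": ["w1", "w3", "w2"], "m3": ["w1", "w2", "w3"]}
WOMEN = {"w1": ["m1", "m3", "m2"], "w2": ["m3", "m1", "m2"], "w3": ["m1", "m3", "m2"]}


def deferred_acceptance(proposers, receivers):
    """Proposer-side deferred acceptance; both arguments map a name to a best-first ranking."""
    rank = {r: {p: k for k, p in enumerate(order)} for r, order in receivers.items()}
    next_choice = {p: 0 for p in proposers}   # index of the next receiver p will try
    held = {}                                 # receiver -> proposal she tentatively keeps
    free = list(proposers)
    while free:
        p = free.pop()
        if next_choice[p] == len(proposers[p]):
            continue                          # p was rejected by everyone and stays single
        r = proposers[p][next_choice[p]]
        next_choice[p] += 1
        current = held.get(r)
        if current is None or rank[r][p] < rank[r][current]:
            held[r] = p                       # r keeps p and releases her previous proposal
            if current is not None:
                free.append(current)
        else:
            free.append(p)                    # r rejects p, who will propose again
    return {p: r for r, p in held.items()}


men_propose = deferred_acceptance(MEN, WOMEN)
women_propose = deferred_acceptance(WOMEN, MEN)
```

Swapping the two arguments gives the version in which the women propose, which on this example ends in the other stable matching, $\{(m_1, w_1), (m_3, w_2), (m_2, w_3)\}$.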


Theorem 10.10 The male-proposal algorithm terminates in a stable matching.


PROOF Suppose not. Then there exists a blocking pair $(m_1, w_1)$ with $m_1$ matched
to $w_2$, say, and $w_1$ matched to $m_2$. Since $(m_1, w_1)$ is blocking and $w_1 \succ_{m_1} w_2$, in
the proposal algorithm, $m_1$ would have proposed to $w_1$ before $w_2$. Since $m_1$ was
not matched with $w_1$ by the algorithm, it must be because $w_1$ received a proposal
from a man that she ranked higher than $m_1$. Since the algorithm matches her to
$m_2$ it follows that $m_2 \succ_{w_1} m_1$. This contradicts the fact that $(m_1, w_1)$ is a blocking
pair.

One could just as well have described an algorithm where the women propose and
the outcome would also be a stable matching. Applied to the example above, this would
produce a stable matching different from the one generated when the men propose.
Thus, not only is a stable matching guaranteed to exist but there can be more than one. If
there can be more than one stable matching, is there a reason to prefer one to another?
Yes. To explain why, we need some notation.

Denote a matching by $\mu$. The woman assigned to man $m$ in the matching $\mu$ is denoted
$\mu(m)$. Similarly, $\mu(w)$ is the man assigned to woman $w$. A matching $\mu$ is male-optimal
if there is no stable matching $\nu$ such that $\nu(m) \succ_m \mu(m)$ or $\nu(m) = \mu(m)$ for all $m$ with
$\nu(j) \succ_j \mu(j)$ for at least one $j \in M$. Similarly define female-optimal.


Theorem 10.11 The stable matching produced by the (male-proposal) Deferred 
Acceptance Algorithm is male-optimal. 


PROOF Let $\mu$ be the matching returned by the male-proposal algorithm. Suppose
$\mu$ is not male optimal. Then, there is a stable matching $\nu$ such that $\nu(m) \succ_m \mu(m)$
or $\nu(m) = \mu(m)$ for all $m$ with $\nu(j) \succ_j \mu(j)$ for at least one $j \in M$. Therefore, in
the application of the proposal algorithm, there must be an iteration where some
man $j$ proposes to $\nu(j)$ before $\mu(j)$, since $\nu(j) \succ_j \mu(j)$, and is rejected by woman
$\nu(j)$. Consider the first such iteration. Since woman $\nu(j)$ rejects $j$ she must have
received a proposal from a man $i$ she prefers to man $j$. Since this is the first
iteration at which a male is rejected by his partner under $\nu$ it follows that man
$i$ ranks woman $\nu(j)$ higher than $\nu(i)$. Summarizing, $i \succ_{\nu(j)} j$ and $\nu(j) \succ_i \nu(i)$,
implying that $\nu$ is not stable, a contradiction.


Clearly one can replace the word “male” by the word “female” in the statement 
of the theorem above. It is natural to ask if there is a stable matching that would be 
optimal with respect to both men and women. Alas, no. The example above has two 
stable matchings: one male optimal and the other female optimal. At least one female 
is strictly better off under the female optimal matching than the male optimal one and 
no female is worse off. A similar relationship holds when comparing the two stable 
matchings from the point of view of the men. 
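For an instance as small as Example 10.8, the claim that there are exactly two stable matchings, one best for the men and one best for the women, can be confirmed by enumerating all six matchings. The sketch below does this; the helper names are chosen for illustration and the preference lists are those of the example.

```python
from itertools import permutations

MEN = {"m1": ["w2", "w1", "w3"], "m2": ["w1", "w3", "w2"], "m3": ["w1", "w2", "w3"]}
WOMEN = {"w1": ["m1", "m3", "m2"], "w2": ["m3", "m1", "m2"], "w3": ["m1", "m3", "m2"]}


def is_stable(matching, men, women):
    """matching maps each man to his partner; stable means no blocking pair exists."""
    husband = {w: m for m, w in matching.items()}
    for m, ranking in men.items():
        for w in ranking[:ranking.index(matching[m])]:   # women m strictly prefers
            if women[w].index(m) < women[w].index(husband[w]):
                return False
    return True


all_matchings = [dict(zip(MEN, ws)) for ws in permutations(WOMEN)]
stable = [mu for mu in all_matchings if is_stable(mu, MEN, WOMEN)]

male_opt = {"m1": "w2", "m2": "w3", "m3": "w1"}      # outcome when the men propose
female_opt = {"m1": "w1", "m2": "w3", "m3": "w2"}    # outcome when the women propose

men_agree = all(MEN[m].index(male_opt[m]) <= MEN[m].index(female_opt[m]) for m in MEN)
husband_m = {w: m for m, w in male_opt.items()}
husband_f = {w: m for m, w in female_opt.items()}
women_agree = all(WOMEN[w].index(husband_f[w]) <= WOMEN[w].index(husband_m[w]) for w in WOMEN)
```

Here $m_2$ and $w_3$ are paired in both stable matchings, every other man strictly prefers the male-optimal matching, and every other woman strictly prefers the female-optimal one.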

A stable matching is immune to a pair of agents opting out of the matching. We
could be more demanding and ask that no subset of agents should have an incentive
to opt out of the matching. Formally, a matching $\mu'$ dominates a matching $\mu$ if there
is a set $S \subseteq M \cup W$ such that for all $m, w \in S$, both (i) $\mu'(m), \mu'(w) \in S$ and (ii)
$\mu'(m) \succ_m \mu(m)$ and $\mu'(w) \succ_w \mu(w)$. Stability is a special case of this dominance
condition when we restrict attention to sets $S$ consisting of a single couple. The set
of undominated matchings is called the core of the matching game. The next result is
straightforward.


Theorem 10.12 The core of the matching game is the set of all stable matchings. 


Thus far we have assumed that the preference orderings of the agents are known to
the planner. Now suppose that they are private information to the agents. As before
we can associate a direct revelation mechanism with an algorithm for finding a stable
matching.

Theorem 10.13 The direct mechanism associated with the male-proposal algorithm
is strategy-proof for the males.


PROOF Suppose not. Then there is a profile of preferences $\pi = (\succ_{m_1}, \succ_{m_2}, \ldots, \succ_{m_n})$
for the men, such that man $m_1$, say, can misreport his preferences and
obtain a better match. To express this formally, let $\mu$ be the stable matching
obtained by applying the male-proposal algorithm to the profile $\pi$. Suppose that
$m_1$ reports the preference ordering $\succ_*$ instead. Let $\nu$ be the stable matching that
results when the male-proposal algorithm is applied to the profile
$\pi^1 = (\succ_*, \succ_{m_2}, \ldots, \succ_{m_n})$. For a contradiction, suppose $\nu(m_1) \succ_{m_1} \mu(m_1)$. For notational
convenience we will write $a \succeq_m b$ to mean that $a \succ_m b$ or $a = b$.

First we show that $m_1$ can achieve the same effect by choosing an ordering $\bar{\succ}$
where woman $\nu(m_1)$ is ranked first. Let $\pi^2 = (\bar{\succ}, \succ_{m_2}, \ldots, \succ_{m_n})$. Knowing that
$\nu$ is stable with respect to the profile $\pi^1$ we show that it is stable with respect to
the profile $\pi^2$. Suppose not. Then under the profile $\pi^2$ there must be a pair $(m, w)$
that blocks $\nu$. Since $\nu$ assigns to $m_1$ his top choice with respect to $\pi^2$, $m_1$ cannot
be part of this blocking pair. Now the preferences of all agents other than $m_1$ are
the same in $\pi^1$ and $\pi^2$. Therefore, if $(m, w)$ blocks $\nu$ with respect to the profile
$\pi^2$, it must block $\nu$ with respect to the profile $\pi^1$, contradicting the fact that $\nu$ is
a stable matching under $\pi^1$.

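Let $\lambda$ be the male-proposal stable matching for the profile $\pi^2$. Since $\nu$ is a stable
matching with respect to the profile $\pi^2$ and $\lambda$ is male optimal with respect to that
profile, $m_1$ weakly prefers $\lambda(m_1)$ to $\nu(m_1)$ under $\bar{\succ}$. Because $\nu(m_1)$ is ranked
first in $\bar{\succ}$, it follows that $\lambda(m_1) = \nu(m_1)$.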

Thus we can assume that $\nu(m_1)$ is the top-ranked woman in the ordering $\succ_*$.
Next we show that the set $B = \{m_j : \mu(m_j) \succ_{m_j} \nu(m_j)\}$ is empty. This means that
all men, not just $m_1$, are no worse off under $\nu$ compared to $\mu$. Since $\nu$ is stable
with respect to the original profile $\pi$, this contradicts the male optimality of $\mu$
and completes the proof.

Suppose $B \ne \emptyset$. Therefore, when the male-proposal algorithm is applied to the
profile $\pi^1$, each $m_j \in B$ is rejected by their match under $\mu$, i.e., $\mu(m_j)$. Consider
the first iteration of the proposal algorithm where some $m_j$ is rejected by $\mu(m_j)$.
This means that woman $\mu(m_j)$ has a proposal from man $m_k$ that she ranks higher,
i.e., $m_k \succ_{\mu(m_j)} m_j$. Since $m_k$ was not matched to $\mu(m_j)$ under $\mu$ it must be that
$\mu(m_k) \succ_{m_k} \mu(m_j)$. Hence $m_k \in B$, otherwise

\[ \mu(m_j) \succeq_{m_k} \nu(m_k) \succeq_{m_k} \mu(m_k) \succ_{m_k} \mu(m_j), \]

which is a contradiction.

Since $m_k \in B$ and $m_k$ has proposed to $\mu(m_j)$ at the time man $m_j$ proposes,
it means that $m_k$ must have been rejected by $\mu(m_k)$ prior to $m_j$ being rejected,
contradicting our choice of $m_j$.


The mechanism associated with the male-proposal algorithm is not strategy-proof for
the females. To see why, it is enough to consider Example 10.8. The male-proposal algorithm
returns the matching $\{(m_1, w_2), (m_2, w_3), (m_3, w_1)\}$. In the course of the algorithm the
only woman who receives at least two proposals is $w_1$. She received proposals from
$m_2$ and $m_3$. She rejects $m_2$, who goes on to propose to $w_3$, and the algorithm terminates.
Notice that $w_1$ is matched with her second choice. Suppose now that she had rejected
$m_3$ instead. Then $m_3$ would have gone on to propose to $w_2$. Woman $w_2$ now has a
choice between $m_1$ and $m_3$. She would keep $m_3$ and reject $m_1$, who would go on to
propose to $w_1$. Woman $w_1$ would keep $m_1$ over $m_2$ and in the final matching be paired
with her first-ranked choice.
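This manipulation can be replayed in code. The sketch below makes two assumptions: it uses a sequential variant of the male-proposal algorithm (one proposer at a time, which yields the same matching as proposing in rounds), and it models the decision of $w_1$ to reject $m_3$ as a misreported ranking in which she places $m_2$ above $m_3$. The function name is hypothetical.

```python
MEN = {"m1": ["w2", "w1", "w3"], "m2": ["w1", "w3", "w2"], "m3": ["w1", "w2", "w3"]}
WOMEN = {"w1": ["m1", "m3", "m2"], "w2": ["m3", "m1", "m2"], "w3": ["m1", "m3", "m2"]}


def male_proposal(men, women):
    """Men propose in order of preference; each woman keeps her best reported proposal."""
    rank = {w: {m: k for k, m in enumerate(order)} for w, order in women.items()}
    next_choice = {m: 0 for m in men}
    held, free = {}, list(men)
    while free:
        m = free.pop()
        if next_choice[m] == len(men[m]):
            continue                       # m has exhausted his list
        w = men[m][next_choice[m]]
        next_choice[m] += 1
        current = held.get(w)
        if current is None or rank[w][m] < rank[w][current]:
            held[w] = m
            if current is not None:
                free.append(current)
        else:
            free.append(m)
    return held                            # woman -> man


truthful = male_proposal(MEN, WOMEN)

# Hypothetical misreport: w1 claims to rank m2 above m3, so she rejects m3 instead of m2.
lying = dict(WOMEN, w1=["m1", "m2", "m3"])
manipulated = male_proposal(MEN, lying)

true_rank_w1 = WOMEN["w1"].index
gain = true_rank_w1(manipulated["w1"]) < true_rank_w1(truthful["w1"])
```

Under truthful reports $w_1$ ends up with $m_3$, her second choice. Under the misreport the chain of rejections described above unfolds and she ends up with $m_1$, her true first choice.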

It is interesting to draw an analogy between the existence of stable matchings and 
that of Walrasian equilibrium. We know (Chapter 6) that Walrasian equilibria exist. 
Furthermore, they are the solutions of a fixed point problem. In the cases when they can 
be computed efficiently it is because the set of Walrasian equilibria can be described 
by a set of convex inequalities. The same can be said of stable matchings. The set of
stable matchings is the set of fixed points of a nondecreasing function defined on a lattice. In
addition, one can describe the set of stable matchings as the solutions to a set of linear 
inequalities. 


10.4.1 A Lattice Formulation 


We describe a proof of the existence of stable matchings using Tarski’s fixed point 
theorem. It will be useful to relax the notion of a matching. Call an assignment of 
women to men such that each man is assigned to at most one woman (but a woman 
may be assigned to more than one man) a male semimatching. The analogous object 
for women will be called a female semimatching. For example, assigning each man 
his first choice would be a male semimatching. Assigning each woman her third choice 
would be an example of a female semimatching. 

A pair of male and female semimatchings will be called a semimatching, which we
will denote by μ, ν, etc. An example of a semimatching would consist of each man
being assigned his first choice and each woman being assigned her last choice.

The woman assigned to the man m under the semimatching μ will be denoted
μ(m). If man m is assigned to no woman under μ, then μ(m) = m. Similarly for μ(w).
Next we define a partial order over the set of semimatchings. Write μ ⪰ ν if


(i) μ(m) ≻_m ν(m) or μ(m) = ν(m) for all m ∈ M and
(ii) μ(w) ≺_w ν(w) or μ(w) = ν(w) for all w ∈ W.


Therefore μ ⪰ ν if all the men are (weakly) better off under μ than in ν and all the women are
(weakly) worse off under μ than in ν.

Next we define the meet and join operations. Given two semimatchings μ and ν
define λ = μ ∨ ν as follows:


(i) λ(m) = μ(m) if μ(m) ≻_m ν(m), otherwise λ(m) = ν(m),
(ii) λ(w) = μ(w) if μ(w) ≺_w ν(w), otherwise λ(w) = ν(w).


Define λ′ = μ ∧ ν as follows:


(i) λ′(m) = μ(m) if μ(m) ≺_m ν(m), otherwise λ′(m) = ν(m),
(ii) λ′(w) = μ(w) if μ(w) ≻_w ν(w), otherwise λ′(w) = ν(w).


With these definitions it is easy to check that the set of semimatchings forms a complete
lattice.

Now define a function f on the set of semimatchings that is nondecreasing. Given
a semimatching μ define f(μ) to be the following semimatching:


(i) f(μ)(m) is man m’s most preferred woman from the set {w : m ≻_w μ(w) or m = μ(w)}.
If this set is empty, set f(μ)(m) = m.

(ii) f(μ)(w) is woman w’s most preferred man from the set {m : w ≻_m μ(m) or w = μ(m)}.
If this set is empty, set f(μ)(w) = w.


It is clear that f maps semimatchings into semimatchings.


Theorem 10.14 There is a semimatching μ such that f(μ) = μ and such that μ is
a stable matching.


PROOF We use Tarski’s theorem. It suffices to check that f is nondecreasing.
Suppose μ ⪰ ν. Pick any m ∈ M. From the definition of ⪰, the women are worse
off under μ than in ν. Thus


{w : m ≻_w ν(w) or m = ν(w)} ⊆ {w : m ≻_w μ(w) or m = μ(w)}


and so f(μ)(m) ≻_m f(ν)(m) or f(μ)(m) = f(ν)(m). A similar argument applies
for each w ∈ W. Thus f is nondecreasing.

Since the conditions of Tarski’s theorem hold, it follows that there is a semimatching
μ such that f(μ) = μ. We show that this semimatching is a stable
matching.

By the definition of a semimatching we have for every m ∈ M, μ(m) single
valued as is μ(w) for all w ∈ W. To show that μ is a matching, suppose not. Then
there is a pair m1, m2 ∈ M, say, such that μ(m1) = μ(m2) = w*. Since f(μ) = μ
it follows that w* is m1’s top-ranked choice in {w : m1 ≻_w μ(w) or m1 = μ(w)} and
m2’s top-ranked choice in {w : m2 ≻_w μ(w) or m2 = μ(w)}. From this we deduce
that μ(w*) = m3 where m1, m2 ⪰_{w*} m3 (w* ranks each of them at least as high as m3). However, m3 = μ(w*) = f(μ)(w*),
which is woman w*’s top-ranked choice in {m : w* ≻_m μ(m) or μ(m) = w*}. Since
m1, m2 are members of this set, we get a contradiction.

To show that the matching μ is stable suppose not. Then there must be a
blocking pair (m*, w*). Let w′ = μ(m*) and m′ = μ(w*), m′ ≠ m* and w* ≠
w′. Since (m*, w*) is blocking, m* ≻_{w*} m′ and w* ≻_{m*} w′. Now w′ = μ(m*) =
f(μ)(m*), which is man m*’s top-ranked choice from {w : m* ≻_w μ(w) or m* =
μ(w)}. But this set contains w*, which is ranked higher by man m* than w′, a
contradiction.
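
To make the fixed-point argument concrete, the following Python sketch iterates f starting from the largest semimatching (every man holds his first choice, every woman is unassigned) until nothing changes. Several things here are our assumptions rather than part of the text: unassigned agents are represented by None instead of by the agent itself, being unassigned is treated as the least preferred outcome, and the preference lists are illustrative.

```python
def rank(prefs, agent, partner):
    """Position of partner in agent's list; None (unassigned) ranks last."""
    return len(prefs[agent]) if partner is None else prefs[agent].index(partner)


def apply_f(mu_men, mu_women, men_prefs, women_prefs):
    """One application of f to the semimatching (mu_men, mu_women)."""
    new_men = {}
    for m, plist in men_prefs.items():
        # women who weakly prefer m to their current assignment
        willing = [w for w in plist
                   if rank(women_prefs, w, m) <= rank(women_prefs, w, mu_women[w])]
        new_men[m] = willing[0] if willing else None
    new_women = {}
    for w, plist in women_prefs.items():
        # men who weakly prefer w to their current assignment
        willing = [m for m in plist
                   if rank(men_prefs, m, w) <= rank(men_prefs, m, mu_men[m])]
        new_women[w] = willing[0] if willing else None
    return new_men, new_women


def iterate_from_top(men_prefs, women_prefs):
    """Apply f repeatedly, starting from the top element of the lattice."""
    mu_men = {m: plist[0] for m, plist in men_prefs.items()}
    mu_women = {w: None for w in women_prefs}
    while True:
        nxt = apply_f(mu_men, mu_women, men_prefs, women_prefs)
        if nxt == (mu_men, mu_women):
            return mu_men, mu_women
        mu_men, mu_women = nxt


# Illustrative preferences (most preferred first); an assumption of this sketch.
men = {"m1": ["w2", "w1", "w3"], "m2": ["w1", "w3", "w2"], "m3": ["w1", "w2", "w3"]}
women = {"w1": ["m1", "m3", "m2"], "w2": ["m3", "m1", "m2"], "w3": ["m1", "m3", "m2"]}

fixed_men, fixed_women = iterate_from_top(men, women)
```

Because f is nondecreasing and the starting point is the top element, the iterates can only move down the lattice, so the loop stops at a fixed point. On this instance the fixed point is a matching in which the men's and women's assignments agree.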


10.4.2 The LP Formulation 


One can formulate the problem of finding a stable matching as the solution to a set of
linear inequalities. For each man m and woman w let x_{mw} = 1 if man m is matched
with woman w and zero otherwise. Then, every stable matching must satisfy the
following.


    Σ_{w ∈ W} x_{mw} = 1                                      ∀ m ∈ M
    Σ_{m ∈ M} x_{mw} = 1                                      ∀ w ∈ W
    Σ_{j ≺_m w} x_{mj} + Σ_{i ≺_w m} x_{iw} + x_{mw} ≤ 1      ∀ m ∈ M, w ∈ W
    x_{mw} ≥ 0                                                ∀ m ∈ M, w ∈ W


Here the first sum in the third constraint ranges over the women j whom m ranks below w,
and the second over the men i whom w ranks below m.
Let P be the polyhedron defined by these inequalities.

The first two constraints of P ensure that each agent is matched with exactly one
other agent of the opposite sex. The third constraint ensures stability. To see why,
suppose Σ_{j ≺_m w} x_{mj} = 1 and Σ_{i ≺_w m} x_{iw} = 1. Then man m is matched to a woman j
that he ranks below w. Similarly, woman w is matched to a man she ranks below m.
This would make the pair (m, w) a blocking pair.
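
As a sanity check on the constraints, the sketch below builds the 0/1 vector x of a given perfect matching and tests the inequalities directly. The function name satisfies_P and the preference lists are illustrative assumptions; no LP solver is involved, and only integral points are examined.

```python
def satisfies_P(matching, men_prefs, women_prefs):
    """Check the constraints of P for the 0/1 vector of `matching` (man -> woman)."""
    x = {(m, w): int(matching[m] == w) for m in men_prefs for w in women_prefs}
    # Each man and each woman is matched exactly once.
    if any(sum(x[m, w] for w in women_prefs) != 1 for m in men_prefs):
        return False
    if any(sum(x[m, w] for m in men_prefs) != 1 for w in women_prefs):
        return False
    # Stability constraint for every pair (m, w).
    for m in men_prefs:
        for w in women_prefs:
            below_w_for_m = men_prefs[m][men_prefs[m].index(w) + 1:]
            below_m_for_w = women_prefs[w][women_prefs[w].index(m) + 1:]
            lhs = (sum(x[m, j] for j in below_w_for_m)
                   + sum(x[i, w] for i in below_m_for_w)
                   + x[m, w])
            if lhs > 1:
                return False
    return True


# Illustrative preferences (most preferred first); an assumption of this sketch.
men = {"m1": ["w2", "w1", "w3"], "m2": ["w1", "w3", "w2"], "m3": ["w1", "w2", "w3"]}
women = {"w1": ["m1", "m3", "m2"], "w2": ["m3", "m1", "m2"], "w3": ["m1", "m3", "m2"]}

stable = {"m1": "w2", "m2": "w3", "m3": "w1"}
unstable = {"m1": "w3", "m2": "w2", "m3": "w1"}  # (m1, w2) is a blocking pair

stable_ok = satisfies_P(stable, men, women)
unstable_ok = satisfies_P(unstable, men, women)
```

For the unstable matching, the constraint for the pair (m1, w2) has left-hand side 2, which is exactly the blocking-pair situation described above.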


Theorem 10.15 P is the convex hull of all stable matchings. 


10.4.3 Extensions 


We have been careful to specify that preferences are strict. If we allow for indifference, 
Theorem 10.7 becomes false. This is because there are instances of the stable matching 
problem in which no male or female optimal stable matching exists. The other theorems 
stated above continue to hold in the presence of indifferences. 

We also limited ourselves to one-to-one matchings. There are situations where one 
side of the market wishes to match with more than one agent. The college admissions 
market is the classic example. Each student can be assigned to at most one college 
but each college can be assigned to many students. In this more general setup colleges 
will have preferences over subsets of students. In the absence of any restrictions on 
these preferences a stable matching need not exist. One restriction on preferences for 
which the results above carry over with no change in statement or proof is the quota 
model. Each college has a strict preference ordering over the students and a quota r 
of students it wishes to admit. Consider two subsets, S and T, of students of size r
that differ in exactly one student. The college prefers the subset containing the more 
preferred student. 

A third extension is to relax the bipartite nature of the stable matching problem.
The nonbipartite version is called the stable roommates problem. Suppose there is a set
N of individuals such that |N| is even. A matching in this setting is a partition of N
into disjoint pairs of individuals (roommates). Each individual has a strict preference
ordering over the other individuals that they would like to be paired with. As before,
a matching is unstable if there exists a pair who prefer each other to the person they
are matched with. Such a pair is called blocking. Unlike in the stable matching problem,
a stable roommates matching need not exist, as the following four-person example illustrates.




≻_1   ≻_2   ≻_3   ≻_4
 3     1     2     2
 2     3     1     1
 4     4     4     3


Each column lists the preference ordering that one agent has over the others. A 
matching that pairs agent 1 with agent 4 will always be blocked by the pair (1, 2). A 
matching that pairs 2 with 4 will be blocked by (2, 3). A matching that pairs 3 and 4 
will be blocked by (3, 1). 
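
With only four agents there are just three possible matchings, so the claim can be checked exhaustively. The Python sketch below does this; the helper names are ours, and agent 4's ordering is taken to be 2, 1, 3 (an assumption, since agent 4 must rank the three other agents).

```python
from itertools import combinations

# Column i of the table: agent i's ordering over the others, best first.
prefs = {1: [3, 2, 4], 2: [1, 3, 4], 3: [2, 1, 4], 4: [2, 1, 3]}


def perfect_matchings(agents):
    """Yield every partition of the list `agents` into unordered pairs."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for partner in rest:
        remaining = [a for a in rest if a != partner]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail


def blocking_pairs(matching, prefs):
    """Pairs of agents who each prefer the other to their assigned roommate."""
    partner = {}
    for a, b in matching:
        partner[a], partner[b] = b, a

    def prefers(a, b):
        return prefs[a].index(b) < prefs[a].index(partner[a])

    return [(a, b) for a, b in combinations(sorted(prefs), 2)
            if partner[a] != b and prefers(a, b) and prefers(b, a)]


report = {tuple(m): blocking_pairs(m, prefs)
          for m in perfect_matchings([1, 2, 3, 4])}
```

Every entry of report is nonempty, so each of the three matchings is blocked, in line with the three cases listed above.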

An O(|N|²) algorithm to determine if a stable matching exists is known. One
can also associate a collection of linear inequalities with the stable roommates problem
such that the system is feasible if and only if a stable roommates solution
exists.


10.5 Future Directions 


While the models in this chapter have been studied and extended in a variety of ways, 
there are plenty of open questions for the creative researcher. 

One direction of future research on the single-peaked preference model of
Section 10.2 would be to consider choosing multiple alternatives (locations) on an
interval (or more general graph) when each agent’s preferences are single-peaked with
respect to the one location that is closest to his peak. As an idealized example,
when downloading files on the Internet one cares only about the location (distance)
of the closest “mirror” site. If a planner can elicit preferences to choose
the location of k mirrors on a network, how can this be done in a strategy-proof
way?

As for the house allocation model of Section 10.3 and the stable matching model of 
Section 10.4, observe that both models are static in nature. Yet, there are a variety of 
dynamic environments that resemble these models in important ways. As an example, 
take the problem of allocating kidneys. Until quite recently those needing a kidney 
transplant would have to wait in a queue (the wait list) for an available kidney that 
would be an appropriate “fit” or else find a donor fulfilling the appropriate medical 
conditions. 

More recently, however, exchange systems have been implemented which allow
kidney patients to “swap” their incompatible friends and relatives
who are willing to donate a kidney. (Suppose that Alice needs a kidney,
and her incompatible friend Bob is willing to donate; also suppose that Carmina
and Dijen are in a similar situation. If Alice and Dijen are compatible, and if
Carmina and Bob are compatible, then a compatible “swap” can be arranged.)
Static versions of such a model have been analyzed by Roth, Sönmez, and Ünver
(2004).

Those authors and others have developed a substantial literature around this important
problem. If donors and recipients arrive dynamically to such a setting, how should
swaps be arranged?


10.6 Notes and References


The canonical results for the single-peaked preference model are provided by 
Moulin (1980), who proved Theorems 10.2 and 10.4 with the additional requirement 
that rules take agents’ peaks as their only input. Ching (1997) subsequently showed 
that this requirement is redundant when a rule is strategy-proof and onto. 

Border and Jordan (1983) generalize these conclusions to multidimensional models
where the outcome space is ℝ^k. They restrict attention to separable preferences, i.e.,
they assume that an agent’s (relative) preferences over any one dimension
remain fixed as we vary the other dimensions of the alternative. For example, with k = 3,
if (x_1, x_2, x_3) ⪰_i (x_1′, x_2, x_3) then separability would imply (x_1, y_2, y_3) ⪰_i (x_1′, y_2, y_3).
Border and Jordan show that a strategy-proof, onto rule for separable preferences
must be decomposable into k (possibly different) one-dimensional rules. Of course,
these one-dimensional rules must be generalized median voter schemes. For further
reference on such generalizations, one should consult the survey of Barberà
(2001).

Another direction in which these results have been generalized pertains to situations 
in which agents have single-peaked preferences on graphs. Schummer and Vohra (2004) 
obtain two types of result, depending on whether the graph contains any cycle. Finally,
the book of Austen-Smith and Banks (2005) contains more details on the key results
of this literature, and a proof of Theorem 10.4.

The house allocation problem was introduced by Herbert Scarf and Lloyd Shapley 
(1974). The TTCA is attributed by these authors to David Gale. The idea that the house 
allocation problem can be used as a model for kidney exchanges is discussed in Roth 
et al. (2004). 

The stable matching problem was introduced by David Gale and Lloyd Shapley 
(1962). The first algorithm for finding a stable matching was developed a decade 
earlier in 1951 to match interns to hospitals (Stalnaker, 1953). The intrinsic appeal of 
the model has inspired three books. The first, by Donald Knuth (1976) uses the stable 
matching problem as a vehicle to illustrate some of the basic ideas in the analysis of 
algorithms. The book by Gusfield and Irving (1989) is devoted to algorithmic aspects 
of the stable matching problem and some of its relatives. On the economics side, the 
book by Roth and Sotomayor (1991) gives a complete game theoretic treatment of the 
stable matching problem as well as some of its relatives. 

The lattice theoretic treatment of the stable matching problem goes back to Knuth 
(1976). The proof of existence based on Tarski’s fixed point theorem is due to Adachi 
(2000). In fact, the proposal algorithm is exactly one of the algorithms for finding a 
fixed point when specialized to the case of stable matchings. 

The linear programming formulation of the stable matching problem is due to Vande 
Vate (1989). The extension of it to the stable roommates problem can be found in Teo
and Sethuraman (1998). Gusfield and Irving (1989) give a full algorithmic account of 
the stable roommates problem. 

In parallel, studies have been made of matching models where monetary transfers
are allowed. This has inspired models that unify the stable matching problem with
matching problems where monetary transfers are allowed. Descriptions can be
found in Fleiner (2003) and Hatfield and Milgrom (2005).


Exercises


10.1 To what extent is Lemma 10.1 sensitive to the richness of the preference domain?
For example, does the result hold if the preference domain is even smaller, e.g.,
containing only symmetric single-peaked preferences?

10.2 Suppose that an anonymous rule described in Theorem 10.2 has parameters
(y_m)_{m=1}^{n−1}. Express this rule as a generalized median voter scheme with parameters
(α_S)_{S ⊆ N}.

10.3 Suppose that a rule f is strategy-proof and onto, but not necessarily anonymous.
Fix the preferences of agents 2 through n, (≻_2, ..., ≻_n), and denote the outcomes
obtainable by agent 1 as


O = f(·, ≻_2, ..., ≻_n) = {x ∈ [0, 1] : ∃ ≻_1 ∈ R s.t. f(≻_1, ≻_2, ..., ≻_n) = x}.


Show that O = [a, b] for some a, b ∈ [0, 1] (without appealing directly to Theorem
10.4).

10.4 Prove Theorem 10.4.

10.5 For the case of three agents, generalize Theorem 10.2 to a 3-leaved tree. Specifically,
consider a connected noncyclic graph (i.e., a tree) with exactly three leaves,
ℓ_1, ℓ_2, ℓ_3. Preferences over such a graph are single-peaked if there is a peak p_i such
that for any x in the graph, and any y in the (unique shortest) path from x to p_i,
y ⪰_i x. The concepts of strategy-proofness, onto, and anonymity generalize in the
straightforward way to this setting. Describe all the rules that satisfy these conditions
for the case n = 3. (Hint: first show that when all agents’ peaks are restricted
to the interval [ℓ_1, ℓ_2], the rule must behave like one described in Theorem 10.2.)
For the nonanonymous case with n ≥ 3, see Schummer and Vohra (2004).

10.6 Prove that the TTCA returns an outcome in the core of the house allocation game.

10.7 The TTC mechanism is immune to agents misreporting their preferences. Is it
immune to agents misreporting the identity of their houses? Specifically, suppose
a subset of agents trade among themselves first before participating in the TTC
mechanism. Can all of them be strictly better off by doing so?

10.8 Consider an instance of the stable matching problem. Let ν be a matching (not
necessarily stable) and μ the male optimal stable matching. Let B = {m : ν(m) ≻_m
μ(m)}. Show that if B ≠ ∅ then there is a man m′ ∉ B and a woman w such that (m′, w) is
a blocking pair for ν.


CHAPTER 11 


Combinatorial Auctions 


Liad Blumrosen and Noam Nisan 


Abstract 


In combinatorial auctions, a large number of items are auctioned concurrently and bidders are allowed 
to express preferences on bundles of items. This is preferable to selling each item separately when 
there are dependencies between the different items. This problem has direct applications, may be 
viewed as a general abstraction of complex resource allocation, and is the paradigmatic problem on 
the interface of economics and computer science. We give a brief survey of this field, concentrating 
on theoretical treatment. 


11.1 Introduction 


A large part of computer science as well as a large part of economics may be viewed as 
addressing the “allocation problem”: how should we allocate “resources” among the 
different possible uses of these resources. An auction of a single item may be viewed 
as a simple abstraction of this question: we have a single indivisible resource, and two 
(or more) players desire using it — who should get it? Being such a simple and general 
abstraction explains the pivotal role of simple auctions in mechanism design theory. 

From a similar point of view, “combinatorial auctions” abstract this issue when multiple
resources are involved: how do I allocate a collection of interrelated resources?
In general, the “interrelations” of the different resources may be combinatorially complex,
and thus handling them requires effective handling of this complexity. It should
thus come as no surprise that the field of “combinatorial auctions” (the subject of
this chapter) is gaining a central place in the interface between computer science and
economics.


11.1.1 Problem Statement 


The combinatorial auction setting is formalized as follows: There is a set of m indivisible
items that are concurrently auctioned among n bidders. For the rest of this chapter we
will use n and m in this way. The combinatorial character of the auction comes from
the fact that bidders have preferences regarding subsets (bundles) of items. Formally,
every bidder i has a valuation function v_i that describes his preferences in monetary
terms:


Definition 11.1 A valuation v is a real-valued function on subsets of items: for each subset S of
items, v(S) is the value that bidder i obtains if he receives this bundle of items.
A valuation must have “free disposal,” i.e., be monotone: for S ⊆ T we have that
v(S) ≤ v(T), and it should be “normalized”: v(∅) = 0.


The whole point of defining a valuation function is that the value of a bundle of items
need not be equal to the sum of the values of the items in it. Specifically for sets S and
T, S ∩ T = ∅, we say that S and T are complements to each other (in v) if v(S ∪ T) >
v(S) + v(T), and we say that S and T are substitutes if v(S ∪ T) < v(S) + v(T).
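
For very small item sets a valuation can simply be tabulated, which makes these definitions easy to test. The Python sketch below stores a valuation as a dictionary keyed by frozensets; the helper names, the example numbers, and the label "additive" for the equality case are our own illustrative choices.

```python
from itertools import combinations


def subsets(items):
    """Yield all subsets of `items` as frozensets."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_valid_valuation(v, items):
    """Normalized (v(∅) = 0) and monotone (S ⊆ T implies v(S) ≤ v(T))."""
    all_sets = list(subsets(items))
    if v[frozenset()] != 0:
        return False
    return all(v[S] <= v[T] for S in all_sets for T in all_sets if S <= T)


def relation(v, S, T):
    """Classify disjoint bundles S and T under valuation v."""
    S, T = frozenset(S), frozenset(T)
    assert not (S & T), "S and T must be disjoint"
    joint, separate = v[S | T], v[S] + v[T]
    if joint > separate:
        return "complements"
    if joint < separate:
        return "substitutes"
    return "additive"


# A pair of shoes: each shoe alone is worth little, the pair is worth a lot.
shoes = {frozenset(): 0, frozenset({"left"}): 1, frozenset({"right"}): 1,
         frozenset({"left", "right"}): 10}
# Two tickets for the same seat: the second ticket adds nothing.
tickets = {frozenset(): 0, frozenset({"a"}): 5, frozenset({"b"}): 5,
           frozenset({"a", "b"}): 5}

shoes_relation = relation(shoes, {"left"}, {"right"})
tickets_relation = relation(tickets, {"a"}, {"b"})
```

The table has 2^m entries, which is the representation problem the chapter returns to later; this form is only practical for toy examples.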

Note that implicit in this definition are two assumptions about bidder preferences:
first, we assume that they are “quasi-linear” in the money; i.e., if bidder i wins bundle
S and pays a price of p for it then his utility is v_i(S) − p. Second, we assume that there
are “no externalities”; i.e., a bidder only cares about the items that he receives and not
about how the other items are allocated among the other bidders.


Definition 11.2 An allocation of the items among the bidders is S_1, ..., S_n
where S_i ∩ S_j = ∅ for every i ≠ j. The social welfare obtained by an allocation
is Σ_i v_i(S_i). A socially efficient allocation (among bidders with valuations
v_1, ..., v_n) is an allocation with maximum social welfare among all allocations.
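
For a handful of items, a socially efficient allocation can be found by trying every way of giving each item to one bidder or to nobody. The Python sketch below does exactly that; the function name and the three example valuations are illustrative assumptions, and the running time is (n + 1)^m, so it is a definition check rather than a practical algorithm.

```python
from itertools import product


def efficient_allocation(items, valuations):
    """Exhaustive search for a welfare-maximizing allocation.

    valuations: one function per bidder, mapping a frozenset of items to a number.
    Returns (welfare, bundles) where bundles[i] is bidder i's bundle.
    """
    items = list(items)
    n = len(valuations)
    best_welfare, best_bundles = float("-inf"), None
    # owners[t] in {0..n-1} gives item t to that bidder; n leaves it unallocated.
    for owners in product(range(n + 1), repeat=len(items)):
        bundles = [frozenset(it for it, o in zip(items, owners) if o == i)
                   for i in range(n)]
        welfare = sum(v(S) for v, S in zip(valuations, bundles))
        if welfare > best_welfare:
            best_welfare, best_bundles = welfare, bundles
    return best_welfare, best_bundles


# Bidder 0 wants both items together; bidders 1 and 2 each want any one item.
v0 = lambda S: 3 if S >= {"a", "b"} else 0
v1 = lambda S: 2 if S else 0
v2 = lambda S: 2 if S else 0

welfare, bundles = efficient_allocation(["a", "b"], [v0, v1, v2])
```

Here splitting the two items between bidders 1 and 2 yields welfare 4, which beats giving both to bidder 0 for a welfare of 3.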


In our usual setting the valuation function v_i of bidder i is private information,
unknown to the auctioneer or to the other bidders. Our usual goal will be to design a 
mechanism that will find the socially efficient allocation. What we really desire is a 
mechanism where this is found in equilibrium, but we will also consider the partial goal 
of just finding the optimal allocation regardless of strategic behavior of the bidders. 
One may certainly also attempt designing combinatorial auctions that maximize the 
auctioneer’s revenue, but much less is known about this goal. 


There are multiple difficulties that we need to address:


• Computational complexity: The allocation problem is computationally hard (NP-complete)
even for simple special cases. How do we handle this?

• Representation and communication: The valuation functions are exponential size objects
since they specify a value for each bundle. How can we even represent them? How do
we transfer enough information to the auctioneer so that a reasonable allocation can be
found?

• Strategies: How can we analyze the strategic behavior of the bidders? Can we design
for such strategic behavior?


The combination of these difficulties, and the subtle interplay between them, is what
gives this problem its generic flavor, in some sense encompassing many of the issues
found in algorithmic mechanism design in general.


11.1.2 Some Applications 


In this chapter we will undertake a theoretical study and will hardly mention specific
applications. More information about various applications can be found in the
references mentioned in Section 11.8. Here we will briefly mention a few.

“Spectrum auctions,” held worldwide and, in particular, in the United States, have
received the most attention. In such auctions a large number of licenses are sold, each 
license being for the use of a certain band of the electromagnetic spectrum in a certain 
geographic area. These licenses are needed, for example, by cell-phone companies. 
To give a concrete example, let us look at the next scheduled auction of the FCC at 
the time of writing (number 66), scheduled for August 2006. This auction is intended 
for “advanced wireless services” and includes 1,122 licenses, each covering a 10- or 
20-MHz spectrum band (somewhere in the 1.7-GHz or 2.1-GHz frequency range) over 
a geographic area that contains a population of between 0.5 million and 50 million. The
total of the minimum bids for all licenses is over 1 billion dollars. Generally speaking, 
in such auctions bidders desire licenses covering the geographic area that they wish to 
operate in, with sufficient bandwidth. Most of the spectrum auctions held so far escaped 
the full complexity of the combinatorial nature of the auction by essentially holding 
a separate auction for each item (but usually in a clever simultaneous way). In such 
a format, bidders could not fully express their preferences, thus leading, presumably, 
to suboptimal allocation of the licenses. In the case of FCC auctions, it has thus been 
decided to move to a format that will allow “combinatorial bidding,” but the details are 
still under debate. 

Another common application area is in transportation. In this setting the auction 
is often “reversed” — a procurement auction — where the auctioneer needs to buy 
the set of items from many bidding suppliers. A common scenario is a company 
that needs to buy transportation services for a large number of “routes” from various 
transportation providers (e.g., trucking or shipping companies). For each supplier, the 
cost of providing a bundle of routes depends on the structure of the bundle as the cost of 
moving the transportation vehicles between the routes in the bundle needs to be taken 
into account. Several commercial companies are operating complex combinatorial 
auctions for transportation services, and commonly report savings of many millions of 
dollars. 

The next application we wish to mention is conceptual, an example demonstrating
that various types of problems may be viewed as special cases of combinatorial
auctions. Consider a communication network that needs to supply multiple “connection
requests,” each requesting a path between two specified nodes in the network
and offering a price for such a path. In the simplest case, each network edge
must be fully allocated to one of the requests, so the paths allocated to the requests
must be edge-disjoint. Which requests should we fulfill, and which paths should we
allocate to them? We can view this as a combinatorial auction: the items sold are the
edges of the network. The players are the different requests, and the valuation of a
request gives the offered price for any bundle of edges that contains a path between the
required nodes, and 0 for all other bundles.


11.1.3 Structure of This Chapter 


We start our treatment of combinatorial auctions, in Section 11.2, by leaving aside
the issue of representation and concentrating on bidders with simple “single-minded”
valuations. For these bidders we address the twin questions of the computational complexity
of allocation and strategic incentive compatibility. The rest of the chapter then
addresses general valuations. Section 11.3 lays out mathematical foundations and introduces
the notion of Walrasian equilibrium and its relation to the linear programming
relaxation of the problem. Section 11.4 describes a first approach for computationally
handling general valuations: representing them in various “bidding languages.”
Section 11.5 describes a second approach, that of using iterative auctions which repeatedly
query bidders about their valuations. In Section 11.6 we show the limitations
of the second approach, pointing out an underlying communication bottleneck. Section
11.7 studies a natural widely used family of iterative auctions, namely those with ascending
prices. Bibliographic notes appear in Section 11.8, followed by a collection of
exercises.


11.2 The Single-Minded Case 


This section focuses on the twin goals of computational complexity and strategic 
behavior, while leaving out completely the third issue of the representational complexity 
of the valuation functions. For this, we restrict ourselves to players with very simple 
valuation functions which we call “single-minded bidders.” Such bidders are interested 
only in a single specified bundle of items, and get a specified scalar value if they get 
this whole bundle (or any superset) and get zero value for any other bundle. 


Definition 11.3 A valuation v is called single minded if there exists a bundle of
items S* and a value v* ∈ ℝ⁺ such that v(S) = v* for all S ⊇ S*, and v(S) = 0
for all other S. A single-minded bid is the pair (S*, v*).


Single-minded valuations are thus very simply represented. The rest of this section 
assumes as common knowledge that all bidders are single minded. 


11.2.1 Computational Complexity of Allocation 


Let us first consider just the algorithmic allocation problem among single-minded
bidders. Recall that in general, an allocation gives disjoint sets of items S_i to each
bidder i, and aims to maximize the social welfare Σ_i v_i(S_i). In the case of single-minded
bidders whose bids are given by (S_i*, v_i*), it is clear that an optimal allocation
can allocate to every bidder either exactly the bundle he desires, S_i = S_i*, or nothing at
all, S_i = ∅. The algorithmic allocation problem among such bidders is thus given by
the following definition.


Definition 11.4 The allocation problem among single-minded bidders is the
following:

INPUT: (S_i*, v_i*) for each bidder i = 1, ..., n.

OUTPUT: A subset of winning bids W ⊆ {1, ..., n} such that for every i ≠ j ∈
W, S_i* ∩ S_j* = ∅ (i.e., the winners are compatible with each other) with maximum
social welfare Σ_{i ∈ W} v_i*.
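
To fix ideas, here is a brute-force Python sketch of this allocation problem: it enumerates every subset W of bidders, keeps those whose desired bundles are pairwise disjoint, and returns one of maximum total value. The function name and the example bids are illustrative assumptions; the search is exponential in n and is meant only to mirror the definition.

```python
from itertools import combinations


def optimal_winners(bids):
    """bids: list of (bundle, value) pairs, one per single-minded bidder.

    Returns (best_value, winners) where winners is a tuple of bidder indices.
    """
    best_value, best_set = 0, ()
    n = len(bids)
    for r in range(1, n + 1):
        for W in combinations(range(n), r):
            bundles = [frozenset(bids[i][0]) for i in W]
            # The winners are compatible if their bundles are pairwise disjoint.
            if all(not (A & B) for A, B in combinations(bundles, 2)):
                value = sum(bids[i][1] for i in W)
                if value > best_value:
                    best_value, best_set = value, W
    return best_value, best_set


example_bids = [({"a", "b"}, 5), ({"b", "c"}, 4), ({"c"}, 3), ({"a"}, 1)]
best_value, winners = optimal_winners(example_bids)
```

In this example bidders 0 and 2 are compatible and together achieve welfare 8, which no other compatible set of bids matches.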


This problem is a “weighted-packing” problem and is NP-complete, which we will 
show by reduction from the INDEPENDENT-SET problem. 


Proposition 11.5 The allocation problem among single-minded bidders is NP- 
hard. More precisely, the decision problem of whether the optimal allocation 
has social welfare of at least k (where k is an additional part of the input) is 
NP-complete. 


PROOF We will make a reduction from the NP-complete “INDEPENDENT-SET”
problem: given an undirected graph G = (V, E) and a number k, does G
have an independent set of size k? An independent set is a subset of the vertices
that have no edge between any two of them. Given such an INDEPENDENT-SET
instance, we will build an allocation problem from it as follows:


• The set of items will be E, the set of edges in the graph.

• We will have a player for each vertex in the graph. For vertex i ∈ V we will have
the desired bundle of i be the set of adjacent edges S_i* = {e ∈ E | i ∈ e}, and the
value be v_i* = 1.


Now notice that a set W of winners in the allocation problem satisfies S_i* ∩ S_j* = ∅
for every i ≠ j ∈ W if and only if the set of vertices corresponding to W is an
independent set in the original graph G. The social welfare obtained by W is
exactly the size of this set, i.e., the size of the independent set. It follows that an
independent set of size at least k exists if and only if the social welfare of the
optimal allocation is at least k. This concludes the NP-hardness proof. The fact
that the problem (of whether the optimal allocation has social welfare at least k)
is in NP is trivial as the optimal allocation can be guessed and then the social
welfare can be calculated routinely.
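
The reduction is short enough to write out and check on small graphs. In the Python sketch below, bids_from_graph builds one single-minded bid per vertex as in the proof, and two exhaustive searches confirm that the optimal welfare equals the size of a largest independent set. All names and the two test graphs are illustrative assumptions.

```python
from itertools import combinations


def bids_from_graph(vertices, edges):
    """One bid per vertex: the bundle is the set of incident edges, the value is 1."""
    edge_sets = [frozenset(e) for e in edges]
    return {v: (frozenset(e for e in edge_sets if v in e), 1) for v in vertices}


def max_welfare(bids):
    """Optimal welfare among single-minded bids, by exhaustive search."""
    players = list(bids)
    best = 0
    for r in range(1, len(players) + 1):
        for W in combinations(players, r):
            if all(not (bids[i][0] & bids[j][0]) for i, j in combinations(W, 2)):
                best = max(best, sum(bids[i][1] for i in W))
    return best


def max_independent_set(vertices, edges):
    """Size of a largest independent set, by exhaustive search."""
    edge_sets = {frozenset(e) for e in edges}
    best = 0
    for r in range(1, len(vertices) + 1):
        for U in combinations(vertices, r):
            if all(frozenset(p) not in edge_sets for p in combinations(U, 2)):
                best = max(best, r)
    return best


cycle_v = [0, 1, 2, 3, 4]
cycle_e = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
star_v = [0, 1, 2, 3]
star_e = [(0, 1), (0, 2), (0, 3)]

cycle_welfare = max_welfare(bids_from_graph(cycle_v, cycle_e))
star_welfare = max_welfare(bids_from_graph(star_v, star_e))
```

Two vertices have intersecting bundles exactly when they share an edge, which is why the two searches agree on every graph.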


As usual when a computational problem is shown to be NP-complete, there are 
three approaches for the next step: approximation, special cases, and heuristics. We 
will discuss each in turn. 

First, we may attempt finding an allocation that is approximately optimal. Formally,
we say that an allocation S_1, ..., S_n is a c-approximation of the optimal one if for every
other allocation T_1, ..., T_n (and specifically for the socially optimal one), we have that


Σ_i v_i(T_i) / Σ_i v_i(S_i) ≤ c.


Perhaps a computationally efficient algorithm will always be able to find
an approximately optimal allocation? Unfortunately, the NP-completeness reduction
above also shows that this will not be possible. Not only is finding
the maximum independent set NP-hard, but approximating it to
within a factor of n^{1−ε} (for any fixed ε > 0) is known to be NP-hard as well. Since in our reduction the
social welfare was exactly equal to the independent-set size, we get the same hardness
here. Often this is stated as a function of the number of items m rather than the number
of players n. Since m ≤ n² (m is the number of edges, n is the number of vertices), we
get:


Proposition 11.6 Approximating the optimal allocation among single-minded
bidders to within a factor better than m^{1/2−ε} is NP-hard.


As we will see in the next subsection, this level of approximation can be reached in
polynomial time, even in an incentive-compatible way.

Second, we can focus on special cases that can be solved efficiently. Several such
cases are known. The first one is when each bidder desires a bundle of at most two items,
|S_i*| ≤ 2. This case is seen to be an instance of the weighted matching problem (in
general nonbipartite graphs) which is known to be efficiently solvable. The second case
is the “linear order” case. Assume that the items are arranged in a linear order and each
desired bundle is a contiguous segment of items, i.e., each S_i* = {j^i, j^i + 1, ..., k^i}
for some 1 ≤ j^i ≤ k^i ≤ m (think of the items as lots along the seashore, and assume
that each bidder wants a connected strip of seashore). It turns out that this case can be
solved efficiently using dynamic programming, which we leave as an exercise to the
reader (see Exercise 11.1).
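
One possible dynamic program for the linear order case is sketched below in Python; it is the standard weighted interval scheduling recurrence and is offered as an illustration, not necessarily as the solution the exercise has in mind. Each bid is assumed to be given as a triple (j, k, value) for the segment of items j, ..., k, and best[t] denotes the optimal welfare using only items 1, ..., t.

```python
def best_interval_allocation(m, bids):
    """bids: list of (j, k, value) with 1 <= j <= k <= m.

    Returns (welfare, winners), where winners is a sorted list of bid indices.
    """
    # Group bids by the last item of their segment.
    ends_at = {}
    for idx, (j, k, value) in enumerate(bids):
        ends_at.setdefault(k, []).append((j, value, idx))

    best = [0] * (m + 1)        # best[t]: optimal welfare on items 1..t
    choice = [None] * (m + 1)   # bid ending at t used in best[t], if any
    for t in range(1, m + 1):
        best[t], choice[t] = best[t - 1], None   # option: leave item t unsold
        for j, value, idx in ends_at.get(t, []):
            if best[j - 1] + value > best[t]:    # option: accept a bid ending at t
                best[t], choice[t] = best[j - 1] + value, idx

    # Walk back through the choices to recover the winning bids.
    winners, t = [], m
    while t > 0:
        if choice[t] is None:
            t -= 1
        else:
            winners.append(choice[t])
            t = bids[choice[t]][0] - 1
    return best[m], sorted(winners)


seashore_bids = [(1, 3, 5), (2, 4, 6), (4, 5, 4), (5, 5, 1)]
welfare, winners = best_interval_allocation(5, seashore_bids)
```

Each bid is examined once and each position once, so the running time is linear in m plus the number of bids.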

Third, an NP-completeness result only says that one cannot write an algorithm that is 
guaranteed to run in polynomial time and obtain optimal outputs on all input instances. 
It may be possible to have algorithms that run reasonably fast and produce optimal (or 
near-optimal) results on most natural input instances. Indeed, it seems to be the case 
here: the allocation problem can be stated as an “integer programming” problem, and 
then the large number of known heuristics for solving integer programs can be applied. 
In particular, many of these heuristics rely on the linear programming relaxation of the 
problem, which we will study in Section 11.3 in a general setting. It is probably safe 
to say that most allocation problems with up to hundreds of items can be practically 
solved optimally, and that even problems with thousands or tens of thousands of items 
can be practically approximately solved quite well. 


11.2.2 An Incentive-Compatible Approximation Mechanism 


After dealing with the purely algorithmic aspect in the last subsection, we now return to
the strategic issues. We still avoid all representation difficulties by
focusing on single-minded bidders. That is, we now wish to take into account the fact
that the true bids are private information of the players, and not simply available to the 
algorithm. We still would like to optimize the social welfare as much as possible. The 
approach we take is the standard one of mechanism design: incentive compatibility. 
We refer the reader to Chapter 9 for background, but in general what we desire is 
an allocation algorithm and payment functions such that each player always prefers 
reporting his private information truthfully to the auctioneer rather than any potential lie. 
This would ensure that the allocation algorithm at least works with the true information. 
We also wish everything to be efficiently computable, of course. 


Definition 11.7 Let $V_{sm}$ denote the set of all single-minded bids on m items, and
let A be the set of all allocations of the m items between n players. A mechanism
for single-minded bidders is composed of an allocation mechanism $f : (V_{sm})^n \to A$
and payment functions $p_i : (V_{sm})^n \to \mathbb{R}$ for $i = 1, \ldots, n$. The mechanism is
computationally efficient if f and all $p_i$ can be computed in polynomial time.
The mechanism is incentive compatible (in dominant strategies) if for every i, and
every $v_1, \ldots, v_n, v_i' \in V_{sm}$, we have that
$v_i(a) - p_i(v_i, v_{-i}) \ge v_i(a') - p_i(v_i', v_{-i})$,
where $a = f(v_i, v_{-i})$, $a' = f(v_i', v_{-i})$, and $v_i(a) = v_i$ if i wins in a and zero
otherwise.


The main difficulty here is the clash between the requirements of incentive com- 
patibility and that of computational efficiency. If we leave aside the requirement of 
computational efficiency then the solution to our problem is simple: take the socially 
efficient allocation and let the payments be the VCG payments defined in Chapter 9. 
These payments essentially charge each bidder his “externality”: the amount by which 
his allocated bundle reduced the total reported value of the bundles allocated to others. 
As shown in Chapter 9, this would be incentive compatible, and would give the exactly 
optimal allocation. However, as shown above, exact optimization of the social welfare is 
computationally intractable. Thus, when we return to the requirement of computational 
efficiency, exact optimization is impossible. Now, one may attempt using “VCG-like” 
mechanisms: take the best approximation algorithm you can find for the problem — 
which can have a theoretical guarantee of no better than $O(\sqrt{m})$ approximation but
may be practically much better — and attempt using the same idea of charging each 
bidder his externality according to the allocation algorithm used. Unfortunately, this 
would not be incentive compatible! VCG-like payments lead to incentive compatibility 
if and only if the social welfare is exactly optimized by the allocation rule (at least over
some subrange of allocations). 

We thus need to find another type of mechanism, one that is not VCG. While in general
settings almost no incentive compatible mechanisms are known beyond VCG, our
single-minded setting is “almost single-dimensional” (in the sense that the private
values are composed of a single scalar and the desired bundle), and for such settings this
is easier. Indeed, the mechanism in Figure 11.1 is computationally efficient, incentive
compatible, and provides a $\sqrt{m}$ approximation guarantee, as good as theoretically
possible in polynomial time.

This mechanism greedily takes winners in an order determined by the value of the
expression $v_i^* / \sqrt{|S_i^*|}$. This expression was chosen so as to optimize the approximation
ratio obtained theoretically, but as we will see, the incentive compatibility result would apply
to any other expression that is monotone increasing in $v_i^*$ and decreasing in $|S_i^*|$. The
intuition behind the choice of j for defining the payments is that this is the bidder who
lost exactly because of i: if Bidder i had not participated in the auction, Bidder j
would have won.


Theorem 11.8 The greedy mechanism is efficiently computable, incentive compatible,
and produces a $\sqrt{m}$ approximation of the optimal social welfare.


The Greedy Mechanism for Single-Minded Bidders:

Initialization:

• Reorder the bids such that $v_1^* / \sqrt{|S_1^*|} \ge v_2^* / \sqrt{|S_2^*|} \ge \cdots \ge v_n^* / \sqrt{|S_n^*|}$.
• $W \leftarrow \emptyset$.

For $i = 1, \ldots, n$ do: if $S_i^* \cap \left( \bigcup_{j \in W} S_j^* \right) = \emptyset$ then $W \leftarrow W \cup \{i\}$.

Output:

Allocation: The set of winners is W.

Payments: For each $i \in W$, $p_i = v_j^* / \sqrt{|S_j^*| / |S_i^*|}$, where j is the
smallest index such that $S_i^* \cap S_j^* \ne \emptyset$, and for all $k < j$, $k \ne i$,
$S_k^* \cap S_j^* = \emptyset$ (if no such j exists then $p_i = 0$).


Figure 11.1. The mechanism achieves a $\sqrt{m}$ approximation for combinatorial auctions with
single-minded bidders.
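
The greedy mechanism can be sketched in Python as follows. The names `rank`, `greedy_winners`, and `greedy_mechanism`, as well as the encoding of a bid as a (bundle, value) pair, are assumptions made for illustration. For the payments, the sketch follows the intuition stated in the text: it reruns the greedy pass without bidder i and takes j to be the first winner of that run whose bundle intersects the bundle of i. This is one reading of the payment rule and is not claimed to coincide with the index condition of the figure on every input. Bundles are assumed to be nonempty.

```python
from math import sqrt
from typing import Dict, FrozenSet, List, Tuple

Bid = Tuple[FrozenSet[str], float]  # (desired bundle S_i*, value v_i*)


def rank(bid: Bid) -> float:
    """Greedy ranking expression v / sqrt(|S|); assumes a nonempty bundle."""
    bundle, value = bid
    return value / sqrt(len(bundle))


def greedy_winners(bids: List[Bid], skip: int = -1) -> List[int]:
    """Winners listed in greedy order; bidder `skip` (if any) is left out."""
    order = sorted((i for i in range(len(bids)) if i != skip),
                   key=lambda i: -rank(bids[i]))  # stable sort: ties broken by index
    taken: set = set()
    winners = []
    for i in order:
        if not (bids[i][0] & taken):  # bundle disjoint from everything allocated so far
            winners.append(i)
            taken |= bids[i][0]
    return winners


def greedy_mechanism(bids: List[Bid]) -> Tuple[List[int], Dict[int, float]]:
    """Return (sorted winners, payments of the winners)."""
    winners = greedy_winners(bids)
    payments: Dict[int, float] = {}
    for i in winners:
        bundle_i = bids[i][0]
        payments[i] = 0.0  # nobody was displaced by i
        for j in greedy_winners(bids, skip=i):
            if bids[j][0] & bundle_i:
                # Value at which i and j would swap places in the greedy order.
                payments[i] = rank(bids[j]) * sqrt(len(bundle_i))
                break
    return sorted(winners), payments
```

For instance, with the hypothetical bids ({a, b}, 10), ({b, c}, 8), ({c}, 3), ({d}, 1), the first, third, and fourth bidders win; the first bidder displaces the second and is charged that bidder's value rescaled by the bundle sizes, while the other two winners displace nobody and pay zero.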


Computational efficiency is obvious; we will show incentive compatibility and the 
approximation performance in two separate lemmas. The incentive compatibility of 
this mechanism follows directly from the following lemma. 


Lemma 11.9 A mechanism for single-minded bidders in which losers pay 0 is
incentive compatible if and only if it satisfies the following two conditions:

(i) Monotonicity: A bidder who wins with bid $(S_i^*, v_i^*)$ keeps winning for any $v_i' > v_i^*$
and for any $S_i' \subset S_i^*$ (for any fixed settings of the other bids).

(ii) Critical Payment: A bidder who wins pays the minimum value needed for winning:
the infimum of all values $v_i'$ such that $(S_i^*, v_i')$ still wins.


Before we prove the lemma (or actually just the side that we need), let us
verify that our mechanism satisfies these two properties. Monotonicity is implied since
increasing $v_i^*$ or decreasing $S_i^*$ can only move bidder i up in the greedy order, making
it easier to win. The critical payment condition is met since i wins as long
as he appears in the greedy order before j. The payment computed is exactly the value
at which the transition between i being before and after j in the greedy order happens.

Note that this characterization is different from the characterization given in Chapter 
9 for general single-parameter agents, since single-minded bidders are not considered 
to have a single parameter, as their private data consists of both their value and their 
desired bundle. 


PROOF We first observe that under the given conditions, a truthful bidder will 
never receive negative utility: his utility is zero while losing (losers pay zero), 
and for winning, his value must be at least the critical value, which exactly equals 
his payment. We will now show that a bidder can never improve his utility by 
reporting some bid (S’, v’) instead of his true values (S, v). If (S’, v’) is a losing bid 
or if S’ does not contain S, then clearly reporting (S, v) can only help. Therefore 
we will assume that (S’, v’) is a winning bid and that $S' \supseteq S$.

We next show that the bidder will never be worse off by reporting (S, v’) rather
than (S’, v’). Denote the bidder’s payment for the bid (S’, v’) by p’, and for the bid
(S, v’) by p. For every x < p, bidding (S, x) will lose since p is a critical value. 
By monotonicity, (S’, x) will also be a losing bid for every x < p, and therefore 
the critical value p’ is at least p. It follows that by bidding (S, v’) instead of (S’, v’) 
the bidder still wins and his payment will not increase. 

It is left to show that bidding (S, v) is no worse than the winning bid (S, v’): 
Assume first that (S, v) is a winning bid with a payment (critical value) p. As 
long as v’ is greater than p, the bidder still wins with the same payment, thus
misreporting his value would not be beneficial. When v’ ≤ p the bidder will lose,
gaining zero utility, and he will not be better off. 

If (S, v) is a losing bid, v must be smaller than the corresponding critical value, 
so the payment for any winning bid (S, v’) will be greater than v, making this 
deviation nonprofitable. 


The approximation guarantee is ensured by the following lemma. 


Lemma 11.10 Let OPT be an allocation (i.e., set of winners) with maximum
value of $\sum_{i \in OPT} v_i^*$, and let W be the output of the algorithm. Then
$\sum_{i \in OPT} v_i^* \le \sqrt{m} \sum_{i \in W} v_i^*$.


PROOF For each $i \in W$ let $OPT_i = \{j \in OPT,\ j \ge i \mid S_i^* \cap S_j^* \ne \emptyset\}$ be the
set of elements in OPT that did not enter W because of i (in addition to i itself).
Clearly $OPT \subseteq \bigcup_{i \in W} OPT_i$ and thus the lemma will follow once we prove the
claim that for every $i \in W$, $\sum_{j \in OPT_i} v_j^* \le \sqrt{m}\, v_i^*$.

Note that every $j \in OPT_i$ appeared after i in the greedy order and thus
$v_j^* \le v_i^* \sqrt{|S_j^*|} / \sqrt{|S_i^*|}$. Summing over all $j \in OPT_i$, we can now estimate

$$\sum_{j \in OPT_i} v_j^* \le \frac{v_i^*}{\sqrt{|S_i^*|}} \sum_{j \in OPT_i} \sqrt{|S_j^*|}. \tag{11.1}$$

Using the Cauchy-Schwarz inequality, we can bound

$$\sum_{j \in OPT_i} \sqrt{|S_j^*|} \le \sqrt{|OPT_i|} \sqrt{\sum_{j \in OPT_i} |S_j^*|}. \tag{11.2}$$

Every $S_j^*$ for $j \in OPT_i$ intersects $S_i^*$. Since OPT is an allocation, these intersections
must all be disjoint, and thus $|OPT_i| \le |S_i^*|$. Since OPT is an allocation,
$\sum_{j \in OPT_i} |S_j^*| \le m$. We thus get $\sum_{j \in OPT_i} \sqrt{|S_j^*|} \le \sqrt{|S_i^*|} \sqrt{m}$, and plugging into
Inequality 11.1 gives the claim $\sum_{j \in OPT_i} v_j^* \le \sqrt{m}\, v_i^*$.


11.3 Walrasian Equilibrium and the LP Relaxation 


In this section we return to discuss combinatorial auctions with general valuations, and 
we will study the linear-programming relaxation of the winner-determination problem 
in such auctions. We will also define the economic notion of a competitive equilibrium 
with item prices (or “Walrasian equilibrium”). Although these notions appear to be
independent at first glance, we will describe a strong connection between them. In
particular, we will prove that the existence of a Walrasian equilibrium is a sufficient and 
necessary condition for having an integer optimal solution for the linear programming 
relaxation (i.e., no integrality gap). One immediate conclusion is that in environments 
where Walrasian Equilibria exist, the efficient allocation can be computed in polynomial 
time. 


11.3.1 The Linear Programming Relaxation and Its Dual 


The winner determination problem in combinatorial auctions can be formulated by an 
integer program. We present the linear programming relaxation of this integer program, 
and denote it by LPR (in the integer program Constraint (11.6) would be replaced with 
“$x_{i,S} \in \{0, 1\}$”).


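The Linear Programming Relaxation (LPR):

\begin{align}
\text{Maximize} \quad & \sum_{i \in N,\, S \subseteq M} x_{i,S}\, v_i(S) \tag{11.3} \\
\text{s.t.} \quad & \sum_{i \in N,\, S \mid j \in S} x_{i,S} \le 1 \qquad \forall j \in M \tag{11.4} \\
& \sum_{S \subseteq M} x_{i,S} \le 1 \qquad \forall i \in N \tag{11.5} \\
& x_{i,S} \ge 0 \qquad \forall i \in N,\ S \subseteq M \tag{11.6}
\end{align}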


In the integer program, each variable $x_{i,S}$ equals 1 if bidder i receives the bundle
S, and zero otherwise. The objective function is therefore maximizing social welfare. 
Condition 11.4 ensures that each item is allocated to at most one bidder, and Condition 
11.5 implies that each player is allocated at most one bundle. Solutions to the linear 
program can be intuitively viewed as fractional allocations: allocations that would be 
allowed if items were divisible. While the LP has exponentially (in m) many variables, 
it still has algorithmic implications. For example, in the case of single-minded bidders 
only a single variable $x_{i,S_i^*}$ for each bidder i is required, enabling direct efficient
solution of the LP. In Section 11.5.2 we will see that, assuming reasonable access to 
the valuations, the general LP can be solved efficiently as well. 

We will also consider the dual linear program. 


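The Dual Linear Programming Relaxation (DLPR):

\begin{align}
\text{Minimize} \quad & \sum_{i \in N} u_i + \sum_{j \in M} p_j \tag{11.7} \\
\text{s.t.} \quad & u_i + \sum_{j \in S} p_j \ge v_i(S) \qquad \forall i \in N,\ S \subseteq M \tag{11.8} \\
& u_i \ge 0,\ p_j \ge 0 \qquad \forall i \in N,\ j \in M \tag{11.9}
\end{align}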



The usage of the notations $p_j$ and $u_i$ is intentional, since we will later see that at the
optimal solution, these dual variables can be interpreted as the prices of the items and 
the utilities of the bidders. 


11.3.2 Walrasian Equilibrium 


A fundamental notion in economic theory is the notion of a competitive equilibrium: a 
set of prices where the market clears, i.e., the demand equals the supply. We will now
formalize this concept, which will be generalized later in Section 11.7.

Given a set of prices, the demand of each bidder is the bundle that maximizes her 
utility. (There may be more than one such bundle, in which case each of them is called 
a demand.) In this section we will consider a linear pricing rule, where a price for each
item is available, and the price of each bundle is the sum of the prices of the items in 
this bundle. 


Definition 11.11 For a given bidder valuation $v_i$ and given item prices
$p_1, \ldots, p_m$, a bundle T is called a demand of bidder i if for every other bundle
$S \subseteq M$ we have that $v_i(S) - \sum_{j \in S} p_j \le v_i(T) - \sum_{j \in T} p_j$.


A Walrasian equilibrium¹ is a set of “market-clearing” prices where every bidder
receives a bundle in his demand set, and unallocated items have zero prices. 


Definition 11.12 A set of nonnegative prices $p_1^*, \ldots, p_m^*$ and an allocation
$S_1^*, \ldots, S_n^*$ of the items is a Walrasian equilibrium if for every player i, $S_i^*$ is
a demand of bidder i at prices $p_1^*, \ldots, p_m^*$, and for any item j that is not allocated
(i.e., $j \notin \bigcup_{i=1}^{n} S_i^*$) we have $p_j^* = 0$.
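
The two definitions above can be checked mechanically on small instances. The following brute-force sketch is illustrative only: the function names (`all_bundles`, `utility`, `is_demand`, `is_walrasian`), the representation of a valuation as a Python function on frozensets, and the two additive bidders in the usage lines are all assumptions made here. It enumerates every bundle, so it is exponential in the number of items.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Sequence

Bundle = FrozenSet[str]
Valuation = Callable[[Bundle], float]


def all_bundles(items: Sequence[str]) -> List[Bundle]:
    """Every subset of `items`, including the empty bundle."""
    ordered = sorted(items)
    return [frozenset(c) for r in range(len(ordered) + 1)
            for c in combinations(ordered, r)]


def utility(v: Valuation, bundle: Bundle, prices: Dict[str, float]) -> float:
    """Value of the bundle minus the sum of its item prices."""
    return v(bundle) - sum(prices[j] for j in bundle)


def is_demand(v: Valuation, bundle: Bundle, prices: Dict[str, float],
              items: Sequence[str], tol: float = 1e-9) -> bool:
    """Demand condition: no bundle gives strictly higher utility than `bundle`."""
    target = utility(v, bundle, prices)
    return all(utility(v, s, prices) <= target + tol for s in all_bundles(items))


def is_walrasian(valuations: Sequence[Valuation], allocation: Sequence[Bundle],
                 prices: Dict[str, float], items: Sequence[str],
                 tol: float = 1e-9) -> bool:
    """Walrasian equilibrium condition, checked by brute force."""
    allocated: set = set()
    for bundle in allocation:
        if bundle & allocated:  # some item handed out twice: not an allocation
            return False
        allocated |= bundle
    if any(prices[j] < 0 for j in items):
        return False  # prices must be nonnegative
    if any(prices[j] > tol for j in items if j not in allocated):
        return False  # unallocated items must have price zero
    return all(is_demand(v, s, prices, items, tol)
               for v, s in zip(valuations, allocation))


# Hypothetical example: two bidders with additive valuations over items a and b.
items = ["a", "b"]
alice = lambda s: 2 * ("a" in s) + 1 * ("b" in s)
bob = lambda s: 1 * ("a" in s) + 2 * ("b" in s)
allocation = [frozenset("a"), frozenset("b")]
print(is_walrasian([alice, bob], allocation, {"a": 1.0, "b": 1.0}, items))
print(is_walrasian([alice, bob], allocation, {"a": 3.0, "b": 1.0}, items))
```

Under these assumed valuations the first price vector supports the allocation, while the second does not, because at a price of 3 the first bidder prefers the empty bundle to item a.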


The following result shows that Walrasian equilibria, if they exist, are econom- 
ically efficient; i.e., they necessarily obtain the optimal welfare. This is a variant 
of the classic economic result known as the First Welfare Theorem but for environ- 
ments with indivisible items. Here we actually prove a stronger statement: the welfare 
in a Walrasian equilibrium is maximal even if the items were divisible. In particular, if a 
Walrasian equilibrium exists, then the optimal solution to the linear program relaxation 
will be integral. 


Theorem 11.13 (The First Welfare Theorem) Let $p_1^*, \ldots, p_m^*$ and
$S_1^*, \ldots, S_n^*$ be a Walrasian equilibrium. Then the allocation $S_1^*, \ldots, S_n^*$ maximizes
social welfare. Moreover, it even maximizes social welfare over all fractional
allocations, i.e., let $\{x_{i,S}^*\}_{i,S}$ be a feasible solution to the linear programming
relaxation. Then $\sum_{i=1}^{n} v_i(S_i^*) \ge \sum_{i \in N,\, S \subseteq M} x_{i,S}^*\, v_i(S)$.


¹ Walras was an economist who published in the 19th century one of the first comprehensive mathematical
analyses of general equilibria in markets. 


PROOF In a Walrasian equilibrium, each bidder receives his demand. Therefore,
for every bidder i and every bundle S, we have
$v_i(S_i^*) - \sum_{j \in S_i^*} p_j^* \ge v_i(S) - \sum_{j \in S} p_j^*$.
Since the fractional solution is feasible to the LPR, we have that for
every bidder i, $\sum_{S \subseteq M} x_{i,S}^* \le 1$ (Constraint 11.5). The left-hand side above is
nonnegative (take S to be the empty bundle, whose value is zero), and therefore

$$v_i(S_i^*) - \sum_{j \in S_i^*} p_j^* \ge \sum_{S \subseteq M} x_{i,S}^* \Big( v_i(S) - \sum_{j \in S} p_j^* \Big). \tag{11.10}$$

The theorem will follow from summing Inequality 11.10 over all bidders, and
showing that $\sum_{i \in N} \sum_{j \in S_i^*} p_j^* \ge \sum_{i \in N,\, S \subseteq M} x_{i,S}^* \sum_{j \in S} p_j^*$. Indeed, the left-hand
side equals $\sum_{j=1}^{m} p_j^*$ since $S_1^*, \ldots, S_n^*$ is an allocation and the prices of unallocated
items in a Walrasian equilibrium are zero, and the right-hand side is at most
$\sum_{j=1}^{m} p_j^*$, since the coefficient of every price $p_j^*$ is at most 1 (by Constraint 11.4
in the LPR).


Following is a simple class of valuations for which no Walrasian equilibrium exists.


Example 11.14 Consider two players, Alice and Bob, and two items {a, b}. 
Alice has a value of 2 for every nonempty set of items, and Bob has a value of 3 
for the whole bundle {a, b}, and 0 for any of the singletons. The optimal allocation 
will clearly allocate both items to Bob. Therefore, Alice must demand the empty 
set in any Walrasian equilibrium. Both prices will be at least 2; otherwise, Alice 
will demand a singleton. Hence, the price of the whole bundle will be at least 
4, Bob will not demand this bundle, and consequently, no Walrasian equilibrium 
exists for these players. 


To complete the picture, the next theorem shows that the existence of an integral 
optimum to the linear programming relaxation is also a sufficient condition for the 
existence of a Walrasian equilibrium. This is a variant of a classic theorem, known as 
“The Second Welfare Theorem,” that provided sufficient conditions for the existence 
of Walrasian equilibria in economies with divisible commodities. 


Theorem 11.15 (The Second Welfare Theorem) If an integral optimal solution
exists for LPR, then a Walrasian equilibrium whose allocation is the given
solution also exists.


PROOF An optimal integral solution for LPR defines a feasible efficient allocation
$S_1^*, \ldots, S_n^*$. Consider also an optimal solution $p_1^*, \ldots, p_m^*, u_1^*, \ldots, u_n^*$ to
DLPR. We will show that $S_1^*, \ldots, S_n^*, p_1^*, \ldots, p_m^*$ is a Walrasian equilibrium.

Complementary-slackness conditions are necessary and sufficient conditions
for the optimality of solutions to the primal linear program and its dual. Because
of the complementary-slackness conditions, for every player i for which $x_{i,S_i^*} > 0$
(i.e., $x_{i,S_i^*} = 1$), we have that Constraint (11.8) is binding for the optimal dual
solution, i.e.,

$$u_i^* = v_i(S_i^*) - \sum_{j \in S_i^*} p_j^*$$

Constraint 11.8 thus also shows that for any other bundle S we get

$$v_i(S_i^*) - \sum_{j \in S_i^*} p_j^* \ge v_i(S) - \sum_{j \in S} p_j^*$$

Finally, the complementary-slackness conditions also imply that for every item j
for which Constraint (11.4) is strict, i.e., $\sum_{i \in N,\, S \mid j \in S} x_{i,S} < 1$ (which for integral
solutions means that item j is unallocated), necessarily $p_j^* = 0$.


The two welfare theorems show that the existence of a Walrasian equilibrium is 
equivalent to having a zero integrality gap: 


Corollary 11.16 A Walrasian equilibrium exists in a combinatorial-auction en- 
vironment if and only if the corresponding linear programming relaxation admits 
an integral optimal solution. 
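
As a worked illustration of the corollary, consider again Example 11.14, for which no Walrasian equilibrium exists. The fractional solution below (with A for Alice and B for Bob, a labeling introduced here) is feasible for LPR and has value 3.5, whereas the best integral allocation, giving both items to Bob, has value 3. So the LPR for this example admits no integral optimal solution, in agreement with the corollary.

```latex
\begin{align*}
& x_{A,\{a\}} = x_{A,\{b\}} = x_{B,\{a,b\}} = \tfrac{1}{2},
  \qquad \text{all other } x_{i,S} = 0 \\
& \text{Constraint (11.4):} \quad
  \text{item } a:\ \tfrac{1}{2} + \tfrac{1}{2} = 1 \le 1, \qquad
  \text{item } b:\ \tfrac{1}{2} + \tfrac{1}{2} = 1 \le 1 \\
& \text{Constraint (11.5):} \quad
  \text{Alice}:\ \tfrac{1}{2} + \tfrac{1}{2} = 1 \le 1, \qquad
  \text{Bob}:\ \tfrac{1}{2} \le 1 \\
& \sum_{i,S} x_{i,S}\, v_i(S)
  = \tfrac{1}{2} \cdot 2 + \tfrac{1}{2} \cdot 2 + \tfrac{1}{2} \cdot 3 = 3.5 > 3
\end{align*}
```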


11.4 Bidding Languages 


This section concerns the issue of the representation of bids in combinatorial auctions. 
Namely, we are looking for representations of valuations that will allow bidders to 
simply encode their valuation and send it to the auctioneer. The auctioneer must 
then take the valuations (bids) received from all bidders and determine the allocation. 
Following sections will consider indirect, iterative ways of transferring information to 
the auctioneer. 

Specifying a valuation in a combinatorial auction of m items requires providing a
value for each of the possible $2^m - 1$ nonempty subsets. A naive representation would
thus require $2^m - 1$ real numbers to represent each possible valuation. It is clear that
this would be completely impractical for more than about two or three dozen items.
The computational complexity can be effectively handled for much larger auctions, 
and thus the representation problem seems to be the bottleneck in practice. 

We will thus be looking for languages that allow succinct representations of val- 
uations. We will call these bidding languages reflecting their intended usage rather 
than the more precise “valuations languages.” From the outset it is clear that due to 
information theoretic reasons it will never be possible to encode all possible valua- 
tions succinctly. Our interest would thus be in succinctly representing interesting or 
important ones. 

When attempting to choose or design a bidding language, we are faced with the same 
types of trade-offs common to all language design tasks: expressiveness vs. simplicity. 
On one hand, we would like our language to express succinctly as many “naturally 
occurring” valuations as possible. On the other hand, we would like it to be as simple 
as possible, both for humans to express and for programs to work with. A well-chosen 
bidding language should aim to strike a good balance between these two goals. 

The bottom line of this section will be the identification of a simple language that
is rather powerful and yet as easily handled by allocation algorithms as are the
single-minded bids studied in Section 11.2.


11.4.1 Elements of Representation: Atoms, OR, and XOR


The common bidding languages construct their bids from combinations of simple 
atomic bids. The usual atoms in such schemes are the single-minded bids addressed in 
Section 11.2: (S, p) meaning an offer of p monetary units for the bundle S of items. 
Formally, the valuation represented by (S, p) is one where v(T) = p for every $T \supseteq S$,
and v(T) = 0 for all other T.

Intuitively, bids can be combined by simply offering them together. Still informally, 
there are two possible semantics for an offer of several bids. One considers the bids as 
totally independent, allowing any subset of them to be fulfilled, and the other considers 
them to be mutually exclusive and allows only one of them to be fulfilled. The first 
semantics is called an OR bid, and the second is called (somewhat misleadingly) an
XOR bid.

Take, for example, the valuations represented by “({a, b}, 3) XOR ({c, d}, 5)” and 
“({a, b}, 3) OR ({c, d}, 5).” Each of them values the bundle {a, c} at 0 (since no atomic 
bid is satisfied) and values the bundle {a,b} at 3. The difference is in the bundle 
{a, b,c, d}, which is valued at 5 by the XOR bid (according to the best atomic bid 
satisfied), but is valued at 8 by the OR bid. For another example, look at the bid 
“({a, b}, 3) OR ({a, c}, 5).” Here, the bundle {a, b, c} is valued at 5 since both atomic 
bids cannot be satisfied together. 

More formally, both OR and XOR bids are composed of a collection of pairs
$(S_i, p_i)$, where each $S_i$ is a subset of the items, and $p_i$ is the maximum price that the
bidder is willing to pay for that subset. For the valuation $v = (S_1, p_1)$ XOR $\cdots$ XOR
$(S_k, p_k)$, the value of v(S) is defined to be $\max_{i \mid S_i \subseteq S} p_i$. For the valuation
$v = (S_1, p_1)$ OR $\cdots$ OR $(S_k, p_k)$, one must be a little careful and the value of v(S) is
defined to be the maximum over all possible “valid collections” W, of the value of
$\sum_{i \in W} p_i$, where W is a valid collection of pairs if $S_i \subseteq S$ for all $i \in W$ and
$S_i \cap S_j = \emptyset$ for all $i \ne j \in W$.
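
The two semantics can be made concrete with a small evaluator. This is a sketch under assumptions made here: atomic bids are encoded as (frozenset, price) pairs with nonnegative prices, and the names `xor_value` and `or_value` are hypothetical. The OR evaluator searches over all valid collections by brute force, so it takes exponential time in the worst case.

```python
from typing import FrozenSet, List, Tuple

Atom = Tuple[FrozenSet[str], float]  # atomic bid (S_i, p_i)


def xor_value(atoms: List[Atom], bundle: FrozenSet[str]) -> float:
    """XOR bid: the best single atomic bid contained in the bundle (0 if none)."""
    return max((p for s, p in atoms if s <= bundle), default=0.0)


def or_value(atoms: List[Atom], bundle: FrozenSet[str]) -> float:
    """OR bid: the best collection of pairwise disjoint atomic bids inside the bundle."""
    usable = [(s, p) for s, p in atoms if s <= bundle]

    def best(k: int, taken: FrozenSet[str]) -> float:
        if k == len(usable):
            return 0.0
        s, p = usable[k]
        skip = best(k + 1, taken)  # leave atom k out of the collection
        if s & taken:              # atom k overlaps an atom already chosen
            return skip
        return max(skip, p + best(k + 1, taken | s))

    return best(0, frozenset())


xor_bid = [(frozenset("ab"), 3.0), (frozenset("cd"), 5.0)]
print(xor_value(xor_bid, frozenset("abcd")))  # the XOR bid takes the better atom
print(or_value(xor_bid, frozenset("abcd")))   # the OR bid adds the two disjoint atoms
```

The two printed values correspond to the bundle {a, b, c, d} in the example discussed above: 5 under the XOR semantics and 8 under the OR semantics.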

It is not difficult to see that XOR bids can represent every valuation v: just XOR the
atomic bids (S, v(S)) for all bundles S. On the other hand, OR bids can represent only
superadditive bids (for any two disjoint sets S, T, $v(S \cup T) \ge v(S) + v(T)$), since the
atoms giving the value v(S) are disjoint from those giving the value v(T), and they
will be added together for $v(S \cup T)$. It is not difficult to see that all superadditive
valuations can indeed be represented by OR bids by ORing the atomic bids (S, v(S))
for all bundles S.

We will be more interested in the size of the representation, defined to be simply the 
number of atomic bids in it. The following basic types of valuations are good examples 
for the power and limitations of these two bidding languages. 


Definition 11.17 A valuation is called additive if $v(S) = \sum_{j \in S} v(\{j\})$ for all
S. A valuation is called unit demand if $v(S) = \max_{j \in S} v(\{j\})$ for all S.


An additive valuation is directly represented by an OR bid:

$(\{1\}, p_1)$ OR $(\{2\}, p_2)$ OR $\cdots$ OR $(\{m\}, p_m)$

while a unit-demand valuation is directly represented by an XOR bid:

$(\{1\}, p_1)$ XOR $(\{2\}, p_2)$ XOR $\cdots$ XOR $(\{m\}, p_m)$

where for each item j, $p_j = v(\{j\})$. Additive valuations can be represented by XOR
bids, but this may take exponential size: atomic bids for all $2^m - 1$ possible bundles
will be needed whenever $p_j > 0$ for all j. (Since an atomic bid is required for every
bundle S with v(S) strictly larger than that of all its strict subsets, which is the case here 
for all S.) On the other hand, nontrivial unit-demand valuations are never superadditive 
and thus cannot be represented at all by OR bids. 


11.4.2 Combinations of OR and XOR 


While both the OR and XOR bidding languages are appealing in their simplicity,
neither is expressive enough to succinctly represent many desirable simple
valuations. A natural attempt is to combine the power of OR bids and XOR bids. The
most general way to allow such combinations is to define OR and XOR
as operations on valuations.


Definition 11.18 Let v and u be valuations, then (v XOR u) and (v OR u) are
valuations and are defined as follows:

• $(v \text{ XOR } u)(S) = \max(v(S), u(S))$.

• $(v \text{ OR } u)(S) = \max_{R, T \subseteq S,\ R \cap T = \emptyset} v(R) + u(T)$.


Thus a general “OR/XOR formula” bid will be given by an arbitrary expres- 
sion involving the OR and XOR operations over atomic bids. For instance, the bid 
(({a, b}, 3) XOR ({c}, 2)) OR ({d}, 5) values the bundle {a, b, c} at 3, but the bundle 
{a, b, d} at 8. The following example demonstrates the added power we can get from 
such combinations just using the restricted structure of an OR of XORs of atomic bids. 
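
Treating OR and XOR as operations on valuations translates directly into function combinators. The sketch below is illustrative; the names `atom`, `XOR`, and `OR` are assumptions made here. In `OR`, the second operand is always given the complement of R inside S rather than an arbitrary disjoint T. This yields the same maximum when the second operand is monotone, which holds for atomic bids and is preserved by both combinators. The enumeration is exponential in the size of the bundle.

```python
from itertools import combinations
from typing import Callable, FrozenSet

Bundle = FrozenSet[str]
Valuation = Callable[[Bundle], float]


def atom(s: Bundle, p: float) -> Valuation:
    """Atomic bid (S, p): worth p on every bundle that contains S, 0 otherwise."""
    return lambda bundle: p if s <= bundle else 0


def XOR(v: Valuation, u: Valuation) -> Valuation:
    """(v XOR u)(S) = max(v(S), u(S))."""
    return lambda bundle: max(v(bundle), u(bundle))


def OR(v: Valuation, u: Valuation) -> Valuation:
    """(v OR u)(S) = max over splits of S of v(R) + u(S minus R), u assumed monotone."""
    def value(bundle: Bundle) -> float:
        items = sorted(bundle)
        return max(v(frozenset(c)) + u(bundle - frozenset(c))
                   for r in range(len(items) + 1)
                   for c in combinations(items, r))
    return value


# The formula bid from the text: (({a, b}, 3) XOR ({c}, 2)) OR ({d}, 5)
bid = OR(XOR(atom(frozenset("ab"), 3), atom(frozenset("c"), 2)),
         atom(frozenset("d"), 5))
print(bid(frozenset("abc")), bid(frozenset("abd")))
```

The two printed values are those stated above for the bundles {a, b, c} and {a, b, d}.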


Definition 11.19 A valuation is called symmetric if v(S) depends only on |S|.
A symmetric valuation is called downward sloping if it can be represented as
$v(S) = \sum_{j=1}^{|S|} p_j$, with $p_1 \ge p_2 \ge \cdots \ge p_m \ge 0$.


It is easy to verify that every downward sloping valuation with $p_1 > p_2 > \cdots > p_m > 0$
requires XOR bids of size $2^m - 1$, and cannot be represented at all by OR bids.


Lemma 11.20 OR-of-XORs bids can express any downward sloping symmetric
valuation on m items in size $m^2$.


PROOF For each j = 1,...,m we will have a clause that offers $p_j$ for any
single item. Such a clause is a simple XOR-bid, and the m different clauses are
all connected by an OR. Since the $p_j$’s are decreasing, we are assured that the
first allocated item will be taken from the first clause, the second item from the 
second clause, etc. 
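
The construction in this proof can be checked on a small instance. The sketch below is illustrative: the names `downward_sloping_bid` and `or_of_xors_value` and the list-of-clauses encoding are assumptions made here, and the evaluator is a brute-force search that picks at most one atom per clause.

```python
from typing import FrozenSet, List, Sequence, Tuple

Atom = Tuple[FrozenSet[str], float]
Clause = List[Atom]  # one XOR clause; the clauses are joined by OR


def downward_sloping_bid(items: Sequence[str], prices: Sequence[float]) -> List[Clause]:
    """Clause j offers prices[j] for any single item: m clauses of m atoms each."""
    return [[(frozenset([x]), p) for x in items] for p in prices]


def or_of_xors_value(clauses: List[Clause], bundle: FrozenSet[str]) -> float:
    """Best choice of at most one atom per clause, with pairwise disjoint atoms."""
    def best(k: int, free: FrozenSet[str]) -> float:
        if k == len(clauses):
            return 0.0
        result = best(k + 1, free)  # clause k contributes nothing
        for s, p in clauses[k]:
            if s <= free:           # atom fits into the items still unused
                result = max(result, p + best(k + 1, free - s))
        return result

    return best(0, frozenset(bundle))


items = ["a", "b", "c"]
prices = [5.0, 3.0, 1.0]  # hypothetical decreasing marginal values p_1 >= p_2 >= p_3
bid = downward_sloping_bid(items, prices)
print(sum(len(clause) for clause in bid))     # number of atomic bids: m * m
print(or_of_xors_value(bid, frozenset("ac"))) # value of a two-item bundle: p_1 + p_2
```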


11.4.3 Dummy Items


General OR/XOR formulae seem very complicated and dealing with them algorithmically
would appear to be quite difficult. Luckily, this is not the case and a generalization
of the language makes things simple again. The main idea is to allow XORs to be
represented by ORs. This is done by allowing the bidders to introduce dummy items
into the bids. These items will have no intrinsic value to any of the participants, but
they will be indirectly used to express XOR constraints. The idea is that an XOR bid
$(S_1, p_1)$ XOR $(S_2, p_2)$ can be represented as $(S_1 \cup \{d\}, p_1)$ OR $(S_2 \cup \{d\}, p_2)$, where
d is a dummy item.


Formally, we let each bidder i have its own set of dummy items $D_i$, which only
he can bid on. An OR* bid by bidder i is an OR bid on the augmented set of items
$M \cup D_i$. The value that an OR* bid gives to a bundle $S \subseteq M$ is the value given by
the OR bid to $S \cup D_i$. Thus, for example, for the set of items M = {a, b, c}, the OR*
bid ({a, d}, 1) OR ({b, d}, 1) OR ({c}, 1), where d is a dummy item, is equivalent to
(({a}, 1) XOR ({b}, 1)) OR ({c}, 1).
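
The OR* semantics can be sketched by reusing an ordinary OR evaluator on the augmented bundle. The names `or_value` and `or_star_value` and the (frozenset, price) encoding are assumptions made for illustration, and the evaluator is a brute-force search.

```python
from typing import FrozenSet, Iterable, List, Tuple

Atom = Tuple[FrozenSet[str], float]


def or_value(atoms: List[Atom], bundle: FrozenSet[str]) -> float:
    """Plain OR bid: best set of pairwise disjoint atomic bids inside the bundle."""
    usable = [(s, p) for s, p in atoms if s <= bundle]

    def best(k: int, taken: FrozenSet[str]) -> float:
        if k == len(usable):
            return 0.0
        s, p = usable[k]
        skip = best(k + 1, taken)
        if s & taken:
            return skip
        return max(skip, p + best(k + 1, taken | s))

    return best(0, frozenset())


def or_star_value(atoms: List[Atom], bundle: Iterable[str],
                  dummies: Iterable[str]) -> float:
    """OR* bid: value of the OR bid on the bundle together with all dummy items."""
    return or_value(atoms, frozenset(bundle) | frozenset(dummies))


# The OR* bid from the text, with dummy item d shared by the first two atoms.
or_star_bid = [(frozenset("ad"), 1.0), (frozenset("bd"), 1.0), (frozenset("c"), 1.0)]
print(or_star_value(or_star_bid, "ab", "d"))   # a and b exclude each other through d
print(or_star_value(or_star_bid, "abc", "d"))  # c is independent of the dummy item
```

Because the dummy item d can be used by only one atomic bid, the atoms on a and b behave as if joined by XOR, while the atom on c is added freely.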

An equivalent but more appealing “user interface” is to let bidders report a set of 
atomic bids together with “constraints” that signify which bids are mutually exclusive. 
Each constraint can then be converted into a dummy item that is added to the con- 
flicting atomic bids. Despite its apparent simplicity, this language can simulate general 
OR/XOR formulae. 


Theorem 11.21 Any valuation that can be represented by OR/XOR formula of 
size s can be represented by OR* bids of size s, using at most $s^2$ dummy items.


PROOF We prove by induction on the formula structure that a formula of size 
s can be represented by an OR* bid with s atomic bids. We then show that each 
atomic bid in the final resulting OR* bid can be modified so as not to include
more than s dummy items in it.

Induction: The basis of the induction is an atomic bid, which is clearly an OR* 
bid with a single atomic bid. The induction step requires handling the two separate 
cases: OR and XOR. To represent the OR of several OR* bids as a single OR* 
bid, we simply merge the set of clauses of the different OR* bids. To represent 
the XOR of several OR* bids as a single OR* bid, we introduce a new dummy 
item $x_{ST}$ for each pair of atomic bids (S, v) and (T, v’) that are in two different
original OR* bids. For each bid (S, v) in any of the original OR* bids, we add to
the generated OR* bid an atomic bid $(S \cup \{x_{ST} \mid T\}, v)$, where T ranges over all
atomic bids in all of the other original OR* bids. 

It is clear that the inductive construction constructs an OR* bid with exactly s 
clauses in it, where s is the number of clauses in the original OR/XOR formula. 
The number of dummy items in it, however, may be large. However, we can 
remove most of these dummy items. One can see that the only significance of a 
dummy item in an OR* bid is to disallow some two (or more) atomic bids to be 
taken concurrently. Thus we may replace all the existing dummy items with at 
most $\binom{s}{2}$ new dummy items, one for each pair of atomic bids that cannot be taken
together (according to the current set of dummy items). This dummy item will be
added to both of the atomic bids in this pair. 


This simulation can be directly turned into a “compiler” that translates OR/XOR
formulae into OR* bids. This has an extremely appealing implication for allocation 
algorithms: to any winner determination (allocation) algorithm, an OR* bid looks just 
like a regular OR-bid on a larger set of items. But an OR bid looks to an allocation 
algorithm just like a collection of atomic bids from different players. It follows that 
any allocation algorithm that can handle single-minded bids (i.e., atomic bids) can 
immediately also handle general valuations represented as OR* bids or as general 
OR/XOR formulae. In particular, the various heuristics mentioned in Section 11.2 can 
all be applied for general valuations represented in these languages. 


11.5 Iterative Auctions: The Query Model 


The last section presented ways of encoding valuations in bidding languages so as to
enable the bidders to directly send their valuation to the auctioneer. In this section we
consider indirect ways of sending information about the valuation: iterative auctions.
In these, the auction protocol repeatedly interacts with the different bidders, aiming to
adaptively elicit enough information about the bidders’ preferences to be able to find
a good (optimal or close to optimal) allocation. The idea is that the adaptivity of the
interaction with the bidders may allow pinpointing the information that is relevant to the 
current auction and not requiring full disclosure of bidders’ valuations. This may not 
only reduce the amount of information transferred and all associated complexities but 
also preserve some privacy about the valuations, only disclosing the information that is 
really required. In addition, in many real-life settings, bidders may need to exert efforts 
even for determining their own valuation (like collecting data, hiring consultants, etc.); 
such iterative mechanisms may assist the bidders with realizing their valuations by 
guiding their attention only to the data that is relevant to the mechanism. 

Such iterative auctions can be modeled by considering the bidders as “black-boxes,” 
represented by oracles, where the auctioneer repeatedly queries these oracles. In such 
models, we should specify the types of queries that are allowed by the auctioneer. 
These oracles may not be truthful, of course, and we will discuss the incentive issues in 
the final part of this section (see also Chapter 12). The auctioneer would be required to 
be computationally efficient in two senses: the number of queries made to the bidders 
and the internal computations. Efficiency would mean polynomial running time in m 
(the number of items) even though each valuation is represented by $2^m$ numbers. The
running time should also be polynomial in n (the number of bidders) and in the number 
of bits of precision of the real numbers involved in the valuations. 


11.5.1 Types of Queries 


Our first step is to define the types of queries that we allow our auctioneer to make 
to the bidders. Probably the most straightforward query one could imagine is where a 
bidder reports his value for a specific bundle. 


Value query: The auctioneer presents a bundle S, the bidder reports his value v(S) for
this bundle.


It turns out that value queries are pretty weak and are not expressive enough in 
many settings. Another natural and widely used type of query is the demand query, in 
which a set of prices is presented to the bidder, and the bidder responds with his most 
valuable bundle under the published prices. 

Demand query (with item prices²): The auctioneer presents a vector of item prices 
$p_1, \ldots, p_m$; the bidder reports a demand bundle under these prices, i.e., some set $S$ 
that maximizes $v(S) - \sum_{i \in S} p_i$. 


How difficult it is for a bidder to answer such a demand query or a value query de- 
pends on his internal representation of his valuation. For some internal representations 
this may be computationally intractable, while for others it may be computationally 
trivial. It does seem though that in many realistic situations the bidders will not really 
have an explicit internal representation, but rather “know” their valuation only in the 
sense of being able to answer such queries. 

The first observation that we should make is that demand queries are strictly more 
powerful than value queries. 


Lemma 11.22 A value query may be simulated by mt demand queries, where t 
is the number of bits of precision in the representation of a bundle’s value. 


PROOF We first show how to answer “marginal value” queries using demand 
queries: given a bundle $S$ and an item $j \notin S$, compute the marginal value of $j$ 
relative to $S$: $v(S \cup \{j\}) - v(S)$ (the items are denoted, w.l.o.g., by $1, \ldots, m$). 
For all $i \in S$ we set $p_i = 0$, for all $i \notin S \cup \{j\}$ we set $p_i = \infty$, and then run 
a binary search on $p_j$. The highest value $p_j$ for which the demand under these 
prices contains $j$ is the marginal value of $j$ relative to $S$. 

Once we can solve marginal value queries, any value query can be solved by 
$v(S) = \sum_{j \in S} \left( v(\{i \in S \mid i \le j\}) - v(\{i \in S \mid i < j\}) \right)$. 
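
The simulation in this proof can be sketched in Python. This is only an illustration under stated assumptions: valuations are integer-valued, monotone, and normalized; the bidder is modeled by a brute-force demand oracle; and the names `make_demand_oracle`, `marginal_value`, and `value_via_demand` are hypothetical helpers, not part of the chapter. Half-integer prices are used in the binary search so that the bidder is never indifferent about item j.

```python
from itertools import combinations

def all_bundles(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def make_demand_oracle(v, items):
    """Brute-force demand oracle for valuation v (a stand-in for a real bidder)."""
    def demand(prices):
        return max(all_bundles(items),
                   key=lambda S: v(S) - sum(prices[i] for i in S))
    return demand

def marginal_value(demand, items, S, j, t):
    """Return v(S + j) - v(S) using t demand queries (binary search on p_j)."""
    lo, hi = 0, 2 ** t - 1            # the marginal value is an integer in [lo, hi]
    while lo < hi:
        mid = (lo + hi + 1) // 2
        # Items of S are free, items outside S + j are unaffordable.
        prices = {i: (0.0 if i in S else float("inf")) for i in items}
        prices[j] = mid - 0.5         # half-integer price: no ties with integer values
        if j in demand(prices):       # j is demanded  <=>  marginal value >= mid
            lo = mid
        else:
            hi = mid - 1
    return lo

def value_via_demand(demand, items, S, t):
    """Telescoping sum of marginal values: at most |S| * t demand queries."""
    total, prefix = 0, set()
    for j in sorted(S):
        total += marginal_value(demand, items, frozenset(prefix), j, t)
        prefix.add(j)
    return total

# Example bidder: 3 per item, capped at 7 (values fit in t = 3 bits).
items = ["a", "b", "c"]
v = lambda S: min(3 * len(S), 7)
demand = make_demand_oracle(v, items)
print(value_via_demand(demand, items, frozenset(items), t=3))
```

Each marginal value costs t queries because the search range has $2^t$ candidates, and a bundle has at most m items, which matches the mt bound of Lemma 11.22.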


Lemma 11.23. An exponential number of value queries may be required for 
simulating a single demand query. 


The proof of Lemma 11.23 is left for Exercise 11.3. 


11.5.2 Solving the Linear Program 


Many algorithms for handling combinatorial auctions or special cases of combinatorial 
auctions start by solving the linear programming relaxation of the problem, shown 
in Section 11.3.1. A very useful and surprising property of demand queries is that 
they allow solving the linear-programming relaxation efficiently. This is surprising 
since the linear program has an exponential number of variables. The basic idea is 


2 In Section 11.7 we consider more general demand queries where a price of a bundle is not necessarily the sum 
of the prices of its items. 




to solve the dual linear program using the Ellipsoid method. The dual program has 
a polynomial number of variables, but an exponential number of constraints. The 
Ellipsoid algorithm runs in polynomial time even on such programs, provided that a 
“separation oracle” is given for the set of constraints. Surprisingly, such a separation 
oracle can be implemented by presenting a single demand query to each of the bidders. 

Consider the linear-programming relaxation (LPR) for the winner determination 
problem in combinatorial auctions, presented in Section 11.3. 


Theorem 11.24 LPR can be solved in polynomial time (in $n$, $m$, and the number 
of bits of precision $t$) using only demand queries with item prices.³ 


PROOF Consider the dual linear program, DLPR, presented in Section 11.3 
(Equations 11.8-11.9). Notice that the dual problem has exactly $n + m$ variables 
but an exponential number of constraints. 

Recall that a separation oracle for the Ellipsoid method, when given a possible 
solution, either confirms that it is a feasible solution, or responds with a constraint 
that is violated by the possible solution. Consider a possible solution $(\vec{u}, \vec{p})$ 
for the dual program. We can rewrite Constraint 11.8 of the dual program as 
$u_i \ge v_i(S) - \sum_{j \in S} p_j$. Now, a demand query to bidder $i$ with prices $p_j$ reveals 
exactly the set $S$ that maximizes the RHS of the previous inequality. Thus, in order 
to check whether $(\vec{u}, \vec{p})$ is feasible it suffices to (1) query each bidder $i$ for his 
demand $D_i$ under the prices $p_j$; (2) check only the $n$ constraints 
$u_i + \sum_{j \in D_i} p_j \ge v_i(D_i)$ (where $v_i(D_i)$ can be simulated using a polynomial sequence of demand 
queries as was previously observed). If none of these are violated then we are 
assured that $(\vec{u}, \vec{p})$ is feasible; otherwise, we get a violated constraint. 

What is left to be shown is how the primal program can be solved. (Recall that 
the primal program has an exponential number of variables.) Since the Ellipsoid 
algorithm runs in polynomial time, it encounters only a polynomial number of 
constraints during its operation. Clearly, if all other constraints were removed 
from the dual program, it would still have the same solution (adding constraints 
can only decrease the space of feasible solutions). Now take the “reduced dual” 
where only the constraints encountered exist, and look at its dual. It will have the 
same solution as the original dual and hence of the original primal, but with a 
polynomial number of variables. Thus, it can be solved in polynomial time, and 
this solution clearly solves the original primal program, setting all other variables 
to zero. 
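
The separation oracle used in this proof can be sketched as follows. This is a minimal illustration, not the chapter's implementation: bidders are modeled by a brute-force demand oracle, the value $v_i(D_i)$ is read directly from the valuation rather than simulated by demand queries, the nonnegativity constraints of the dual are not checked, and the names `brute_force_demand` and `separation_oracle` are hypothetical.

```python
from itertools import combinations

def all_bundles(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def brute_force_demand(v, items):
    """Stand-in for a bidder answering demand queries with item prices."""
    def demand(p):
        return max(all_bundles(items), key=lambda S: v(S) - sum(p[j] for j in S))
    return demand

def separation_oracle(u, p, valuations, demand_oracles):
    """Return None if u_i + sum_{j in S} p_j >= v_i(S) holds for all i and S,
    otherwise return one violated constraint as a pair (i, S)."""
    for i, (v, demand) in enumerate(zip(valuations, demand_oracles)):
        D = demand(p)                      # one demand query per bidder
        if u[i] + sum(p[j] for j in D) < v(D):
            return (i, D)                  # the constraint for (i, D) is violated
    return None

# One additive bidder who values a at 3 and b at 2.
items = ["a", "b"]
valuations = [lambda S: 3 * ("a" in S) + 2 * ("b" in S)]
oracles = [brute_force_demand(v, items) for v in valuations]
prices = {"a": 1, "b": 1}
print(separation_oracle([3], prices, valuations, oracles))
print(separation_oracle([2], prices, valuations, oracles))
```

Checking only the demanded bundle is enough because it maximizes $v_i(S) - \sum_{j \in S} p_j$: if its constraint holds, every other constraint of bidder $i$ holds as well.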


11.5.3 Approximating the Social Welfare 


The final part of this section will highlight some of the prominent algorithmic results for 
combinatorial auctions. Some of these results are obtained by solving the LP relaxation. 
Figure 11.2 lists state-of-the-art results for the point in time in which this chapter 


3 The solution will have a polynomial-size support (nonzero values for $x_{i,S}$), and thus we will be able to describe 
it in polynomial time. 

[Table of Figure 11.2. Columns: Class, Queries, Approx, IC approx, Lower bound. Rows cover the classes Gen, SubA, XOS, SubM, Subs, kDup, and Proc, split by query type. The individual entries did not survive extraction.] 


Figure 11.2. The best algorithmic results, incentive-compatible approximation 
results, and lower bounds currently known for different classes of 
combinatorial-auction valuations. All results apply for a polynomial number of queries of the specified 
type. Results without references can be trivially derived from other entries in this table. The 
word “rand” implies that the result is achieved by a randomized algorithm; otherwise, the 
results correspond to deterministic algorithms only. Results that use $\epsilon$ hold for any $\epsilon > 0$. 
For simplicity of presentation, we ignore the constants of the asymptotic results (i.e., 
we drop the big-Oh and $\Omega$ notations). [NS06]: Nisan and Segal, 2006; [BN05a]: Blumrosen 
and Nisan, 2005; [DS05]: Dobzinski and Schapira, 2005; [LS05]: Lavi and Swamy, 2005; 
[DNS06]: Dobzinski et al., 2006; [Fei06]: Feige, 2006; [DNS05]: Dobzinski et al., 2005; 
[DS06]: Dobzinski and Schapira, 2006; [LLN06]: Lehmann et al., 2006; [KLMM05]: Khot et al., 
2005; [FV06]: Feige and Vondrak, 2006; [Ber05]: Bertelsen, 2005; [GS99]: Gul and Stacchetti, 
1999; [BM97]: Bikhchandani and Mamer, 1997; [BKV05]: Briest et al., 2005; [BGN03]: Bartal 
et al., 2003; [Nis02]: Nisan, 2002. 


was written. For each class of bidder valuations, we mention the best currently known 
polynomial-time approximation ratio, the optimal ratio that is currently achievable 
by ex-post Nash incentive-compatible mechanisms that run in polynomial time, and 
the best computational hardness result for the algorithmic problem (under standard 
computational assumptions). We also classify the results according to the queries they 
use: unrestricted, value queries, or demand queries. In the figure, we refer the reader 
to the papers that established these results for more details. In particular, a randomized 
incentive-compatible mechanism that achieves an $O(\sqrt{m})$-approximation for general 
combinatorial auctions is discussed in Chapter 12. Below are the classes of valuations 
that we consider and their abbreviations: 


Gen — General (unrestricted) valuations. 

SubA — Subadditive valuations, i.e., where $v(S \cup T) \le v(S) + v(T)$ for all $S, T$. 

XOS — All valuations that can be represented by XOR-of-ORs bids with singleton 
atomic bundles (see Section 11.4). 

SubM - Submodular valuations, i.e., where for every two bundles $S$ and $T$ we have 
that $v(S) + v(T) \ge v(S \cup T) + v(S \cap T)$. 

Subs — (Gross-) substitutes valuations, see Definition 11.28 in Section 11.7. 

kDup — Combinatorial auctions with k duplicates of each good. Each bidder desires 
at most a single item of each good. 

Proc — Procurement auctions, where a single buyer needs to buy a set of m items 
from n suppliers. The suppliers have privately known costs for bundles of items. The 
buyer aims to minimize the total cost paid. 


It is known that $\mathrm{Gen} \supset \mathrm{SubA} \supset \mathrm{XOS} \supset \mathrm{SubM} \supset \mathrm{Subs}$. 
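
As an illustration of the subadditive and submodular definitions, the following brute-force checks test the two inequalities over all pairs of bundles. This is a sketch for tiny item sets only (it enumerates all $2^m$ bundles); the function names and the three example valuations are assumptions made for this example.

```python
from itertools import combinations

def all_bundles(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_subadditive(v, items):
    """v(S | T) <= v(S) + v(T) for all bundles S, T."""
    B = list(all_bundles(items))
    return all(v(S | T) <= v(S) + v(T) for S in B for T in B)

def is_submodular(v, items):
    """v(S) + v(T) >= v(S | T) + v(S & T) for all bundles S, T."""
    B = list(all_bundles(items))
    return all(v(S) + v(T) >= v(S | T) + v(S & T) for S in B for T in B)

items = ["a", "b", "c"]
unit_demand = lambda S: 4 if S else 0                       # wants any single item
single_minded = lambda S: 10 if len(S) == 3 else 0          # wants the whole set
step = lambda S: {0: 0, 1: 1, 2: 1, 3: 2}[len(S)]           # subadditive only

for name, v in [("unit_demand", unit_demand),
                ("single_minded", single_minded),
                ("step", step)]:
    print(name, is_subadditive(v, items), is_submodular(v, items))
```

The `step` valuation separates the two classes: it is subadditive, but $v(\{a,b\}) + v(\{b,c\}) = 2 < 3 = v(\{a,b,c\}) + v(\{b\})$, so it is not submodular.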


11.6 Communication Complexity 


We already saw in Section 11.2.1 that solving the optimal allocation problem is NP- 
complete even for single-minded bidders and thus certainly for more general types 
of bidders. However, as mentioned, in practice one can usually solve problems with 
thousands or tens-of-thousands of items and bids optimally or near-optimally. Will it be 
possible to do the same for general valuations using some type of queries to the bidders? 
In other words: is the problem of representing the valuations an obstacle beyond the 
computational hardness? In this section we provide an affirmative answer: even if the 
auctioneer had unlimited computational power, eliciting enough information 
from the bidders to determine the optimal allocation would require an exponential 
number of queries to the bidders, for any query type. We present this lower bound in 
a very general model, Yao’s two-party communication complexity model, and thus 
it holds for essentially any model of iterative combinatorial auctions with any type of 
queries. Let us first introduce this model formally. 


11.6.1 The Model and Statement of Lower Bound 


The lower bound is obtained in Yao’s standard model of two-player communication 
complexity. In this model we consider two players, Alice and Bob, each holding a 
valuation function. We can restrict ourselves to the special case where the value of 
each set is either 0 or 1. Thus, the inputs are monotone functions $v_1, v_2 : 2^M \to \{0, 1\}$. 
Alice and Bob must embark on a communication protocol whose final outcome is 
the declaration of an allocation $(S, S^c)$ that maximizes $v_1(S) + v_2(S^c)$. The protocol 
specifies rules for exchanging bits of information, where Alice’s message at each point 
may depend only on $v_1$ and on previous messages received from Bob, while Bob’s 
message at each point may depend only on $v_2$ and on previous messages received from 
Alice. No computational constraints are put on Alice and Bob; only communication 
is measured. The main result shows that: 


Theorem 11.25 Every protocol that finds the optimal allocation for every pair 
of 0/1 valuations $v_1, v_2$ must use at least $\binom{m}{m/2}$ bits of total communication in the 
worst case. 


Note that $\binom{m}{m/2}$ is exponential in $m$.⁴ Since Yao’s communication model is very 
powerful, the lower bound immediately applies to essentially all settings where $v_1$ and 
$v_2$ reside in “different places.” In particular, it applies to the case where the bidders reply to 
queries of the auctioneer (since a protocol with an auctioneer can be converted into one 
without an auctioneer, by sending all replies directly to each other and having Alice 
and Bob simulate the auctioneer’s queries) and to any larger number of bidders (since 
the 2-bidder case is a special case where all bidders but two have null valuations). 


11.6.2 The Proof 


Fix a communication protocol that for every input valuation pair $(v_1, v_2)$ finds an 
optimal allocation $(S, S^c)$. We will construct a “fooling set”: a set of valuation pairs 
with the property that the communication patterns produced by the protocol must be 
different for different valuation pairs. Specifically, for every 0/1 valuation $v$, we define 
the dual valuation $v^*$ to be $v^*(S) = 1 - v(S^c)$. Note that (i) $v^*$ is indeed a monotone 0/1 
valuation, and (ii) for every partition $(S, S^c)$, $S \subseteq M$, we have that $v(S) + v^*(S^c) = 1$. 


Lemma 11.26 Let $v \ne u$ be arbitrary 0/1 valuations. Then, in a welfare-maximizing 
combinatorial auction, the sequence of bits transmitted on inputs $(v, v^*)$ 
is not identical to the sequence of bits transmitted on inputs $(u, u^*)$. 


Before we prove the lemma, let us see how the main theorem is implied. Since 
different input valuation pairs lead to different communication sequences, we see that 
the total possible number of communication sequences produced by the protocol is 
at least the number of valuation pairs $(v, v^*)$, which is exactly the number of distinct 
0/1 valuations $v$. The number of 0/1 valuations can be easily bounded from below by 
$2^{\binom{m}{m/2}}$ by counting only valuations such that $v(S) = 0$ for all $|S| < m/2$, $v(S) = 1$ for 
all $|S| > m/2$, and allowing $v(S)$ to be either 0 or 1 for $|S| = m/2$; there are $\binom{m}{m/2}$ sets 
of size $m/2$, so the total number of such valuations is exponential in this number. The 
protocol must thus be able to produce $2^{\binom{m}{m/2}}$ different communication sequences. Since 
these are binary sequences, at least one of the sequences must be of length at least 
$\binom{m}{m/2}$. 


4 More precisely, by Stirling’s formula, $\binom{m}{m/2} \sim \sqrt{2/(\pi m)} \cdot 2^m$. 

PROOF (of lemma) Assume, by way of contradiction, that the communication 
sequence on $(v, v^*)$ is the same as on $(u, u^*)$. We first show that the same 
communication sequence would also be produced for $(v, u^*)$ and for $(u, v^*)$. Consider the 
case of $(v, u^*)$; i.e., Alice has valuation $v$ and Bob has valuation $u^*$. Alice does not 
see $u^*$ so she behaves and communicates exactly as she would in the $(v, v^*)$ case. 
Similarly, Bob behaves as he would in the $(u, u^*)$ case. Since the communication 
sequences in the $(v, v^*)$ and the $(u, u^*)$ cases are the same, neither Alice nor Bob 
ever notices a deviation from this common sequence, and thus neither ever deviates. 
In particular, this common sequence is followed also on the $(v, u^*)$ 
case. Thus, the same allocation $(S, S^c)$ is produced by the protocol in all four 
cases: $(v, v^*)$, $(u, u^*)$, $(v, u^*)$, $(u, v^*)$. We will show that this is impossible, since 
a single allocation cannot be optimal for all four cases. 

Since $u \ne v$, we have that for some set $T$, $v(T) \ne u(T)$. Without loss of 
generality, $v(T) = 1$ and $u(T) = 0$, and so $v(T) + u^*(T^c) = 2$. The allocation 
$(S, S^c)$ produced by the protocol must be optimal on the valuation pair $(v, u^*)$, 
thus $v(S) + u^*(S^c) \ge 2$. However, since 
$(v(S) + v^*(S^c)) + (u(S) + u^*(S^c)) = 1 + 1 = 2$, we get that $u(S) + v^*(S^c) \le 0$. 
On the input pair $(u, v^*)$, however, welfare 1 is achievable for normalized valuations 
(giving all items to Bob yields $v^*(M) = 1 - v(\emptyset) = 1$). Thus $(S, S^c)$ is not an optimal 
allocation for the input pair $(u, v^*)$, contradicting the fact that the protocol 
produces it as the output in this case as well. 
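
The fooling-set property can be checked exhaustively for a very small number of items. The sketch below is only a sanity check under stated assumptions: bundles are encoded as bitmasks, valuations are restricted to monotone 0/1 functions with $v(\emptyset) = 0$ and $v(M) = 1$ (so that the dual is again of this form), and all function names are hypothetical.

```python
from itertools import product

def monotone_01_valuations(m):
    """Yield all monotone v: 2^M -> {0,1} with v(empty) = 0 and v(M) = 1,
    as tuples indexed by the bitmask of the bundle."""
    n = 1 << m
    for bits in product((0, 1), repeat=n):
        if bits[0] != 0 or bits[n - 1] != 1:
            continue
        if all(bits[S] <= bits[S | (1 << j)] for S in range(n) for j in range(m)):
            yield bits

def dual(v, m):
    """The dual valuation v*(S) = 1 - v(complement of S)."""
    full = (1 << m) - 1
    return tuple(1 - v[full ^ S] for S in range(1 << m))

def optimal_allocations(v1, v2, m):
    """All S such that (S, complement of S) maximizes v1(S) + v2(complement)."""
    full = (1 << m) - 1
    welfare = [v1[S] + v2[full ^ S] for S in range(1 << m)]
    best = max(welfare)
    return {S for S, w in enumerate(welfare) if w == best}

def common_optimum_exists(v, u, m):
    """Is some allocation optimal on (v,v*), (u,u*), (v,u*) and (u,v*) at once?"""
    vs, us = dual(v, m), dual(u, m)
    common = (optimal_allocations(v, vs, m) & optimal_allocations(u, us, m)
              & optimal_allocations(v, us, m) & optimal_allocations(u, vs, m))
    return bool(common)

m = 3
valuations = list(monotone_01_valuations(m))
clashes = [(v, u) for v in valuations for u in valuations
           if v != u and common_optimum_exists(v, u, m)]
print(len(valuations), len(clashes))
```

For every pair of distinct valuations in this family, no single allocation is optimal in all four cases, which is exactly what the proof of Lemma 11.26 uses.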


More complex lower bounds on communication allow us to prove tight lower bounds 
for iterative auctions in various settings. The above lower bound on communication can 
be extended to even approximating the social welfare. 


Theorem 11.27 For every $\epsilon > 0$, approximating the social welfare in a combinatorial 
auction to within a factor strictly smaller than $\min\{n, m^{1/2-\epsilon}\}$ requires 
exponential communication. 


Note that this is tight: achieving a factor of $n$ is always trivial (by bundling all items 
together and selling them in a simple single-item auction), and for $n \ge \sqrt{m}$ there exists 
an $O(\sqrt{m})$ approximation (see Figure 11.2). Actually, most of the lower bounds 
described in Figure 11.2 are communication-complexity results. 


11.7 Ascending Auctions 


This section concerns a large class of combinatorial auction designs which contains 
the vast majority of implemented or suggested ones: ascending auctions. These are a 
subclass of iterative auctions with demand queries in which the prices can only increase. 
In this class of auctions, the auctioneer publishes prices, initially set to zero (or some 
other minimum prices), and the bidders repeatedly respond to the current prices by 
bidding on their most desired bundle of goods under the current prices. The auctioneer 
then repeatedly updates the prices by increasing some of them in some manner, until 
a level of prices is reached where the auctioneer can declare an allocation. There are 
several reasons for the popularity of ascending auctions, including their intuitiveness, 

An item-price ascending auction for substitutes valuations: 


Initialization: 
    For every item $j \in M$, set $p_j \leftarrow 0$. 
    For every bidder $i$ let $S_i \leftarrow \emptyset$. 
Repeat 
    For each $i$, let $D_i$ be the demand of $i$ at the following prices: 
        $p_j$ for $j \in S_i$ and $p_j + \epsilon$ for $j \notin S_i$. 
    If for all $i$, $S_i = D_i$, exit the loop; 
    Find a bidder $i$ with $S_i \ne D_i$ and update: 
        • For every item $j \in D_i \setminus S_i$, set $p_j \leftarrow p_j + \epsilon$ 
        • $S_i \leftarrow D_i$ 
        • For every bidder $k \ne i$, $S_k \leftarrow S_k \setminus D_i$ 


Finally: Output the allocation $S_1, \ldots, S_n$. 


Figure 11.3. An item-price ascending auction that ends up with a nearly optimal allocation 
when bidders’ valuations have the (gross) substitutes property. 


the fact that private information is only partially revealed, that it is clear that they will 
terminate, and that they may increase the seller’s revenue in some settings. 

We will describe auctions that belong to two families of ascending auctions. One 
family uses a simple pricing scheme (item prices), and guarantees economic efficiency 
for a restricted class of bidder valuations. The second family is socially efficient for 
every profile of valuations, but uses a more complex pricing scheme — prices for bundles 
— extending the demand queries defined in Section 11.5. 


11.7.1 Ascending Item-Price Auctions 


Figure 11.3 describes an auction that is very natural from an economic point of view: 
increase prices gradually, maintaining a tentative allocation, until no item that is ten- 
tatively held by one bidder is demanded by another. Intuitively, at this point de- 
mand equals supply and we are close to a Walrasian equilibrium discussed earlier in 
Section 11.3, which, by the first welfare theorem, is socially efficient. 

Of course, we know that a Walrasian equilibrium does not always exist in a com- 
binatorial auction, so this cannot always be true. The problem is that the auction does 
not ensure that items are not underdemanded: it may happen that an item that was 
previously demanded by a bidder is no longer so. The following class of valuations are 
those in which this cannot happen. 


Definition 11.28 A valuation $v_i$ satisfies the substitutes (or gross-substitutes) 
property if for every pair of item-price vectors $\vec{q} \ge \vec{p}$ (coordinate-wise 
comparison), we have that the demand at prices $\vec{q}$ contains all items in the 
demand at prices $\vec{p}$ whose price remained constant. Formally, for every 
$A \in \arg\max_S \{v_i(S) - \sum_{j \in S} p_j\}$, there exists 
$D \in \arg\max_S \{v_i(S) - \sum_{j \in S} q_j\}$ such 
that $D \supseteq \{j \in A \mid p_j = q_j\}$. 

That is, the only items that could drop from the demand when prices change from 
$\vec{p}$ to $\vec{q}$ are those whose price has strictly increased. The substitutes property rules out 
any form of complementarities. For example, a single-minded bidder who is willing to 
pay 10 for the complete bundle {a, b} will demand both items at prices (3, 3), but if the 
price of b is raised to 8, this bidder will no longer demand any item, contrary to the 
requirement of a substitutes valuation. Exercise 11.6 shows that, in general, substitutes 
valuations must be submodular. It is not difficult to see that this class of valuations 
contains the classes of additive valuations, unit-demand valuations, and downward- 
sloping valuations (see Definitions 11.17 and 11.19). With such valuations, the auction 
maintains the property that every item is demanded by some bidder. The auction 
terminates when all the bidders receive their demanded bundles, and consequently, the 
auction converges to a (nearly) Walrasian equilibrium. 
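
The single-minded example above can be reproduced with a brute-force check of the substitutes condition for one pair of price vectors. This is an illustrative sketch only: it tests a single pair $\vec{p} \le \vec{q}$ rather than all pairs, it uses integer values and prices so that ties are detected exactly, and the names `demand_set` and `respects_substitutes` are hypothetical.

```python
from itertools import combinations

def all_bundles(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def demand_set(v, items, prices):
    """All utility-maximizing bundles at the given item prices."""
    util = {S: v(S) - sum(prices[j] for j in S) for S in all_bundles(items)}
    best = max(util.values())
    return [S for S, u in util.items() if u == best]

def respects_substitutes(v, items, p, q):
    """Check the substitutes condition for one price pair with q >= p."""
    assert all(q[j] >= p[j] for j in items)
    demands_q = demand_set(v, items, q)
    for A in demand_set(v, items, p):
        kept = {j for j in A if p[j] == q[j]}      # items whose price did not move
        if not any(kept <= D for D in demands_q):
            return False
    return True

items = ["a", "b"]
single_minded = lambda S: 10 if len(S) == 2 else 0   # pays 10 for {a, b} only
additive = lambda S: 5 * len(S)                      # 5 per item
p = {"a": 3, "b": 3}
q = {"a": 3, "b": 8}
print(respects_substitutes(single_minded, items, p, q))
print(respects_substitutes(additive, items, p, q))
```

The single-minded bidder drops item a even though only the price of b rose, so the check fails; the additive bidder keeps a and passes for this price pair.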


Definition 11.29 An allocation $S_1, \ldots, S_n$ and prices $p_1, \ldots, p_m$ are an 
$\epsilon$-Walrasian equilibrium if $\bigcup_i S_i \supseteq \{j \mid p_j > 0\}$ and for each $i$, $S_i$ is a demand 
of $i$ at prices $p_j$ for $j \in S_i$ and $p_j + \epsilon$ for $j \notin S_i$. 


Theorem 11.30 For bidders with substitutes valuations, the auction described 
in Figure 11.3 ends with an $\epsilon$-Walrasian equilibrium. In particular, the allocation 
achieves welfare that is within $n\epsilon$ of the optimal social welfare. 


PROOF The theorem will follow from the following key claim: 
Claim 11.31 At every stage of the auction, for every bidder $i$, $S_i \subseteq D_i$.⁵ 


First notice that this claim is certainly true at the beginning. Now let us see what 
an update step for some bidder $i$ causes. For $i$ itself, $S_i$ after the step is exactly 
equal to $D_i$ (note that the changes in prices of items just added to $S_i$ exactly 
match those defining $D_i$). For $k \ne i$, two changes may occur at this step: first, 
items may have been taken from $S_k$ by $i$, and second, the prices of items outside 
of $S_k$ may have increased. The first type of change makes $S_k$ smaller while not 
affecting $D_k$. The second type of change does not affect $S_k$, and the substitutes 
property directly implies that the only items that can be removed from $D_k$ are 
those whose price strictly increased and are thus not in $S_k$. 

Once we have this claim, it is directly clear that no item that was ever demanded 
by any player is ever left unallocated; i.e., $\bigcup_i S_i$ always contains all items whose 
price is strictly positive. Since the auction terminates only when all $D_i = S_i$ we 
get an $\epsilon$-Walrasian equilibrium. The fact that an $\epsilon$-Walrasian equilibrium is close 
to socially optimal is obtained just as in the proof of the first welfare theorem 
(Theorem 11.13). 
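
The auction of Figure 11.3 can be sketched in Python for small instances. This is an illustration under several assumptions, none of which come from the chapter: bidders are simulated by brute-force demand computation, values and $\epsilon$ are integers so that ties are exact, and ties are broken in favor of bundles that contain the bidder's current holdings and then toward smaller bundles. The names `demand` and `ascending_auction` are hypothetical.

```python
from itertools import combinations

def all_bundles(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def demand(v, items, prices, held):
    """A demand bundle at `prices`; ties are broken toward bundles that
    contain `held`, then toward smaller bundles (an assumption of this sketch)."""
    util = {S: v(S) - sum(prices[j] for j in S) for S in all_bundles(items)}
    best = max(util.values())
    candidates = [S for S, u in util.items() if u == best]
    pool = [S for S in candidates if held <= S] or candidates
    return min(pool, key=lambda S: (len(S), sorted(S)))

def ascending_auction(valuations, items, eps=1):
    p = {j: 0 for j in items}                       # item prices start at zero
    S = [frozenset() for _ in valuations]           # tentative allocation
    while True:
        D = []
        for i, v in enumerate(valuations):
            # Bidder i sees p_j for held items and p_j + eps for the others.
            q = {j: p[j] if j in S[i] else p[j] + eps for j in items}
            D.append(demand(v, items, q, S[i]))
        unhappy = [i for i in range(len(valuations)) if D[i] != S[i]]
        if not unhappy:
            return S, p
        i = unhappy[0]
        for j in D[i] - S[i]:                       # raise prices of newly taken items
            p[j] += eps
        S[i] = D[i]
        for k in range(len(S)):                     # other bidders lose those items
            if k != i:
                S[k] = S[k] - D[i]

# Alice wants any one item for 4; Bob values each item at 5.
items = ["a", "b"]
alice = lambda S: 4 if S else 0
bob = lambda S: 5 * len(S)
allocation, prices = ascending_auction([alice, bob], items, eps=1)
print(allocation, prices)
```

With these two substitutes valuations, Bob ends up holding both items at prices within $\epsilon$ of 4, in line with the $\epsilon$-Walrasian guarantee of Theorem 11.30.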


Since prices are only going up, the algorithm terminates after at most $m \cdot v_{\max}/\epsilon$ 
stages, where $v_{\max}$ is the maximum valuation. It may also be useful to view this auction 


5 For simplicity of presentation, the algorithm assumes that $D_i$ is unique. In the general case, the claim is that $S_i$ 
is contained in some demand bundle $D_i$, and the auction is required to pick such a $D_i$. 
as implementing a primal-dual algorithm. The auction starts with a feasible solution to 
the dual linear program (here, zero prices), and as long as the complementary-slackness 
conditions are unsatisfied proceeds by improving the solution of the dual program (i.e., 
increasing some prices). 

Finally, we will address the strategic behavior of the bidders in such ascending 
auctions. Will strategic bidders act myopically and truthfully reveal their demand in 
these auctions? If the valuation functions have complementarities, then bidders will 
clearly have strong incentives not to report their true preferences, due to a problem 
known as the exposure problem: Bidders who bid for a complementary bundle (e.g., 
a pair of shoes), are exposed to the risk that part of the bundle (the left shoe) may be 
taken from them later, and they are left liable for the price of the rest of the bundle (the 
right shoe) that is worthless for them. 

However, even for substitutes preferences the incentive issues are not solved. The 
prices in Walrasian equilibria are not necessarily VCG prices, and therefore truthful 
bidding is not an ex-post equilibrium.⁶ The strategic weakness of Walrasian equilibria 
is that bidders may have the incentive to demand smaller bundles of items (demand 
reduction), in order to lower their payments. The following example illustrates such a 
scenario. 


Example 11.32 Consider two items a and b and two players, Alice and Bob, 
with the following substitutes valuations: 


      | v(a) | v(b) | v(ab) | 
Alice |  4   |  4   |   4   | 
Bob   |  5   |  5   |  10   | 


For these valuations, the auction in Figure 11.3 will terminate at the Walrasian 
equilibrium prices $p_a = 4$, $p_b = 4$, where Bob receives both items, earning 
him a payoff of 2. If Bob placed bids only on a during the auction, then the auction 
would stop at zero prices, allocating a to Bob and b to Alice. With this demand 
reduction, Bob improves his payoff to 5. 
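
Bob's two payoffs follow directly from the table. In the worked restatement below, $u_{\text{Bob}}$ denotes Bob's quasi-linear payoff (value minus payment), a symbol introduced here only for this calculation:

```latex
\begin{align*}
\text{truthful bidding:} \quad
  & u_{\text{Bob}} = v_{\text{Bob}}(ab) - (p_a + p_b) = 10 - (4 + 4) = 2, \\
\text{bidding only on } a\text{:} \quad
  & u_{\text{Bob}} = v_{\text{Bob}}(a) - p_a = 5 - 0 = 5.
\end{align*}
```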


11.7.2 Ascending Bundle-Price Auctions 


As we saw, not every profile of valuations has a Walrasian equilibrium. The next type 
of auction that we describe will reach an equilibrium that involves a more complex 
pricing scheme. We start by describing this extended notion of equilibrium, allowing 
personalized bundle prices — a distinct price per each possible bundle and for each 
bidder. That is, personalized bundle prices specify a price p;(S) per each bidder i and 
every bundle $S$. We can naturally generalize the notion of the demand of bidder $i$ under 
such prices to $\arg\max_S (v_i(S) - p_i(S))$. 


6 When we further restrict the class of substitutes valuations such that each bidder desires at most one item 
(“unit-demand” valuations, see Definition 11.17), then it is known that a similar auction reaches the lowest 
possible Walrasian-equilibrium prices that are also VCG prices, and hence these auctions are ex-post Nash 
incentive compatible (see Chapter 9). 

A bundle-price auction: 

Initialization: For every player $i$ and bundle $S$, let $p_i(S) \leftarrow 0$. 


Repeat 
    • Find an allocation $T_1, \ldots, T_n$ that maximizes revenue at current prices, 
      i.e., $\sum_{i=1}^n p_i(T_i) \ge \sum_{i=1}^n p_i(Y_i)$ for any other allocation $Y_1, \ldots, Y_n$. 
      (Bundles with zero prices will not be allocated, i.e., $p_i(T_i) > 0$ for every $i$ with $T_i \ne \emptyset$.) 
    • Let $L$ be the set of losing bidders, i.e., $L = \{i \mid T_i = \emptyset\}$. 
    • For every $i \in L$ let $D_i$ be a demand bundle of $i$ under the prices $\vec{p}_i$. 
    • If for all $i \in L$, $D_i = \emptyset$ then terminate. 
    • For all $i \in L$ with $D_i \ne \emptyset$, let $p_i(D_i) \leftarrow p_i(D_i) + \epsilon$. 


Figure 11.4. A bundle price auction which terminates with the socially efficient allocation for 
any profile of bidders. 


Definition 11.33 Personalized bundle prices $\vec{p} = \{p_i(S)\}$ and an allocation 
$S = (S_1, \ldots, S_n)$ are called a competitive equilibrium if: 

• For every bidder $i$, $S_i$ is a demand bundle, i.e., for any other bundle $T_i \subseteq M$, 
$v_i(S_i) - p_i(S_i) \ge v_i(T_i) - p_i(T_i)$. 

• The allocation $S$ maximizes seller’s revenue under the current prices, i.e., for any 
other allocation $(T_1, \ldots, T_n)$, $\sum_{i=1}^n p_i(S_i) \ge \sum_{i=1}^n p_i(T_i)$. 


It is easy to see that with personalized bundle prices, competitive equilibria always 
exist: any welfare-maximizing allocation with the prices p;(S) = v;(S) gives a compet- 
itive equilibrium. This may be viewed as the Second Welfare Theorem (see Theorem 
11.15) for this setting. Even this weak notion of equilibrium, however, guarantees 
optimal social welfare: 


Proposition 11.34 In any competitive equilibrium $(\vec{p}, S)$ the allocation 
maximizes social welfare. 


PROOF Let $(\vec{p}, S)$ be a competitive equilibrium, and consider some allocation 
$T = (T_1, \ldots, T_n)$. Since $S_i$ is a demand bundle under the prices $\vec{p}_i$ for every 
bidder $i$, we have that $v_i(S_i) - p_i(S_i) \ge v_i(T_i) - p_i(T_i)$. Summing over all the 
bidders, together with $\sum_{i=1}^n p_i(S_i) \ge \sum_{i=1}^n p_i(T_i)$, we get that the welfare in the 
allocation $S$ is at least the welfare in $T$. 
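
Written out, the summation step of this proof reads as follows (a restatement of the argument above, using only the two conditions of Definition 11.33):

```latex
\begin{align*}
\sum_{i=1}^{n} v_i(S_i)
  &\ge \sum_{i=1}^{n} v_i(T_i)
     + \Big( \sum_{i=1}^{n} p_i(S_i) - \sum_{i=1}^{n} p_i(T_i) \Big)
  && \text{(summing the $n$ demand inequalities)} \\
  &\ge \sum_{i=1}^{n} v_i(T_i)
  && \text{(revenue maximality of $S$).}
\end{align*}
```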


Several iterative auctions are designed to end up with competitive equilibria. 
Figure 11.4 describes a typical one. At each stage the auctioneer computes a ten- 
tative allocation that maximizes his revenue at current prices — which we view as the 
current bids. All the losing bidders then “raise their bids” on their currently demanded 
bundle. When no losing bidder is willing to do so, we terminate with an approximately 
competitive equilibrium. 


Definition 11.35 A bundle $S$ is an $\epsilon$-demand for a player $i$ under the bundle 
prices $\vec{p}_i$ if for any other bundle $T$, $v_i(S) - p_i(S) \ge v_i(T) - p_i(T) - \epsilon$. An 
$\epsilon$-competitive equilibrium is similar to a competitive equilibrium (Definition 
11.33), except each bidder receives an $\epsilon$-demand under the equilibrium prices. 


Theorem 11.36 For any profile of valuations, the bundle-price auction described 
in Figure 11.4 terminates with an $\epsilon$-competitive equilibrium. In particular, 
the welfare obtained is within $n\epsilon$ of the optimal social welfare. 


PROOF At each step of the auction at least one price will be raised. Since a bundle 
price will clearly never exceed its value, the auction will terminate eventually 
(although this may take exponentially many steps). Since the allocation at each 
step is clearly revenue maximizing, it suffices to show that, upon termination, 
each bidder receives an $\epsilon$-demand. 

Losing bidders will clearly receive their demand, the empty set, since this is 
the condition of termination. A winning bidder $i$ gets an $\epsilon$-demand bundle since 
the auction maintains the property that every bundle $T_i$ with $p_i(T_i) > 0$ is an 
$\epsilon$-demand. To see this notice that $p_i(T_i) > 0$ implies that at some previous round 
$T_i$ was the demand of bidder $i$. At that point, $T_i$ was the exact demand, and thus 
an $\epsilon$-demand bundle after the price increment. Since the last time that the bidder 
demanded (the current) $T_i$, only prices of other bundles have increased, clearly 
maintaining the property. 

Finally, the near optimality of the social welfare in an approximate competitive 
equilibrium follows the same arguments as in Proposition 11.34. 
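
The auction of Figure 11.4 can likewise be sketched for tiny instances. The code below is illustrative only and makes assumptions beyond the text: the revenue-maximizing allocation is found by exhaustive enumeration (exponential in the number of items), values and $\epsilon$ are integers, a bidder who is indifferent prefers the smaller bundle (so the empty set wins ties), and the name `bundle_price_auction` is hypothetical.

```python
from itertools import combinations, product

def all_bundles(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def bundle_price_auction(valuations, items, eps=1):
    items = sorted(items)
    n = len(valuations)
    empty = frozenset()
    prices = [dict() for _ in range(n)]     # prices[i][S]; a missing entry means 0

    def price(i, S):
        return prices[i].get(S, 0)

    while True:
        # Revenue-maximizing allocation by brute force; owner -1 means unsold.
        best_rev, best_alloc = -1, None
        for owner in product(range(-1, n), repeat=len(items)):
            alloc = [frozenset(j for j, o in zip(items, owner) if o == i)
                     for i in range(n)]
            # Bundles with zero price are not allocated.
            alloc = [T if price(i, T) > 0 else empty for i, T in enumerate(alloc)]
            rev = sum(price(i, T) for i, T in enumerate(alloc))
            if rev > best_rev:
                best_rev, best_alloc = rev, alloc
        losers = [i for i in range(n) if best_alloc[i] == empty]
        raised = False
        for i in losers:
            v = valuations[i]
            # Demand at personalized bundle prices; ties go to smaller bundles.
            D = max(all_bundles(items), key=lambda S: (v(S) - price(i, S), -len(S)))
            if D != empty:
                prices[i][D] = price(i, D) + eps   # the loser raises his bid on D
                raised = True
        if not raised:                      # every loser demands the empty set
            return best_alloc, prices

# Bidder 0 wants both items together (value 5); bidder 1 values each item at 2.
items = ["a", "b"]
v0 = lambda S: 5 if len(S) == 2 else 0
v1 = lambda S: 2 * len(S)
allocation, final_prices = bundle_price_auction([v0, v1], items, eps=1)
print(allocation)
```

In this instance the complementary bidder wins both items, which is the welfare-maximizing outcome even though no item-price Walrasian mechanism is guaranteed to find it for such valuations.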


Notice that while the auction always terminates with a (near) optimal allocation, 
this may require exponential time in two respects: first, the number of stages may 
be exponential, and, second, each stage requires the auctioneer to solve an NP-hard 
optimization problem. Of course, we know that this is unavoidable and that, indeed, 
exponential communication and computation are required in the worst case. Variants 
of this auction may be practically faster by allowing bidders to report a collection of 
demand bundles at each stage and increase the prices of all of them (in particular, prices 
of supersets of a reported demand bundle can be, w.l.o.g., maintained to be at least as 
high as that of the bundle itself). 

The prices generated by this auction are not VCG prices and thus players are not 
strategically motivated to act myopically and truthfully report their true demand at 
each stage.⁷ One weak positive equilibrium property is achieved when each bidder is 
committed in advance to act according to a fixed valuation (“proxy bidding”). Then, 
the auction admits ex-post Nash equilibria, but these equilibria require the participants 
to possess considerable knowledge of the preferences of the other bidders. 

More complex variants of the auction may charge VCG prices from the bidders 
rather than the equilibrium prices obtained. While this will have the obvious advantage 
that truthful bidding will be an ex-post Nash equilibrium, it turns out that this will lose 
some nice properties possessed by the equilibrium prices reached (like resistance to 
bidder collusion and to false-name bids in some settings). 


7 When bidders have substitutes valuations (Definition 11.28), however, the auction does terminate at VCG prices. 

11.8 Bibliographic Notes 


This chapter gives only the very basics of the theoretical treatment of combinatorial 
auctions. Much more information appears in the recently published books (Cramton 
et al., 2006; Milgrom, 2004). Information about spectrum auctions can be found, for 
example, in (FCC auctions home page; Cramton, 2002, 2006), and a nice description 
of industrial applications can be found in (Sandholm, 2006a). 

The earliest work on the computational complexity of the winner determination 
problem in combinatorial auctions is Rothkopf et al. (1998), which contains algorithms 
for various special cases. Other early work on algorithms for winner determination is 
due to Sandholm (2002), who also noted the NP-hardness of the problem and of its 
approximation. The hardness of approximation is based on the hardness of 
approximation of clique size of Håstad (1999), with the strong version as stated appearing 
in Zuckerman (2006). Recent surveys on winner determination algorithms appear in 
(Lehmann et al., 2006b; Müller, 2006; Sandholm, 2006b). The single-minded case was 
studied in Lehmann et al. (2002) on which Section 11.2.2 is based. Additional results 
for the single-minded case and generalizations of it can be found in Babaioff et al. 
(2005) and the references within. 

The LP formulation of the problem and the relation of its integrality gap to Walrasian 
equilibria were studied in Bikhchandani and Mamer (1997) and Bikhchandani and 
Ostroy (2002). 

Bidding languages were studied in a general and formal way in Nisan (2000) on 
which Section 11.4 is based. Dummy items were suggested in Fujishima et al. (1999). 
A detailed survey of bidding languages appears in Nisan (2006). 

A systematic study of the query model can be found in Blumrosen and Nisan (2005a). 
The fact that the linear program can be solved in polynomial time using demand queries 
appears in Nisan and Segal (2006) and Blumrosen and Nisan (2005a). Applications of 
this fact for various approximation algorithms can be found in Dobzinski et al. (2005), 
Lavi and Swamy (2005), and Feige and Vondrak (2006). Relations of the query model 
to machine-learning theory are described in Blum et al. (2004) and Lahaie and Parkes 
(2004) and the references within. 

The analysis of the communication complexity of combinatorial auctions was initi- 
ated in Nisan and Segal (2006) on which Section 11.6 is based. A more comprehensive 
treatment of this subject can be found in the survey (Segal, 2006). A detailed exposi- 
tion of the theory of communication complexity can be found in Kushilevitz and Nisan 
(1997). 

Ascending item-price combinatorial auctions for the (gross)-substitutes case were 
first suggested by Demange et al. (1986), extending their use for matching by Kelso and 
Crawford (1982). These were further studied in Bikhchandani and Mamer (1997), 
Gul and Stacchetti (1999, 2000), Milgrom (2004), and Ausubel (2006). Socially- 
efficient ascending bundle-price auctions were suggested in Parkes and Ungar (2000) 
and Ausubel and Milgrom (2002), and hybrid designs that use both item- and bundle 
prices appear in Kelly and Steinberg (2000) and Cramton et al. (2006). Ausubel and 
Milgrom (2002) also discussed connections to coalitional games and their core. A 
detailed study of ascending auctions and their limitations may be found in Blumrosen 
and Nisan (2005b). A comprehensive survey can be found in Parkes (2006). 
Exercise 11.1 is from Rothkopf et al. (1998). A proof for Exercise 11.2 can be found 
in Müller (2006). Exercise 11.3 is from Blumrosen and Nisan (2005a). Exercise 11.4 
is from Dobzinski et al. (2005). Exercise 11.5 is from Nisan (2000). Exercise 11.6 is 
from Gul and Stacchetti (1999). Exercise 11.7 is from Parkes (2001) and Blumrosen 
and Nisan (2005b). Exercise 11.8 is from Blumrosen and Nisan (2005b). The algorithm 
in Exercise 11.9 is the classic one for SET-COVER by Lovász (1975); see also Nisan 
(2002). 
Exercises 


11.1. Consider an auction for items $1, \ldots, m$ where each bidder is single minded and 
desires an interval of consecutive items, i.e., $S_i = \{j \mid k_i \le j \le l_i\}$ where 
$1 \le k_i \le l_i \le m$. Prove that in this case the socially efficient allocation can be 
determined in polynomial time. 


11.2. Consider combinatorial auctions for $m$ items among $n$ bidders, where each 
valuation is represented simply as a vector of $2^m - 1$ numbers (a value for each 
subset of items). Prove that the optimal allocation can be computed in time that is 
polynomial in the input length: $n(2^m - 1)$. (An immediate conclusion is that when 
$m = O(\log n)$ then the optimal allocation can be computed in polynomial time in $n$.) 
Hint: Use dynamic programming. 

11.3. Show a class of valuations for bidders in combinatorial auctions for which a single 
demand query can reveal enough information for determining the optimal allocation, 
but this task may require an exponential number (in the number of items) of 
value queries. (This actually proves Lemma 11.23 from Section 11.5.1.) 

Hint: Use the fact that the number of distinct bundles of size $m/2$, out of $m$ items, is 
exponential in $m$. 


11.4. A valuation $v$ is called subadditive if for every two bundles $S, T$, 
$v(S) + v(T) \ge v(S \cup T)$. Prove that for any $\epsilon > 0$, achieving a $2 - \epsilon$ 
approximation in a combinatorial auction with subadditive bidders requires 
exponential communication. 

Hint: Construct a reduction from Theorem 11.27 in Section 11.6. 


11.5. The majority valuation assigns a value of 1 to any bundle of at least $m/2$ items, and 0 
to all other bundles. Prove that representing this valuation using an OR* formula 
requires size of at least $\binom{m}{m/2}$. 


11.6. Prove that every (gross) substitutes valuation is submodular. 


11.7. Consider an anonymous-price variant of the bundle-price ascending auctions 
described in Figure 11.4: The same ascending-price process is performed, except 
that at every stage, all bidders observe the same bundle prices $\{p(S)\}_{S \subseteq M}$. At each 
stage, the prices of bundles that are demanded by at least one losing bidder are 
raised by $\epsilon$. 

Show that when all the valuations are superadditive such an auction terminates 
with the socially efficient allocation. (A valuation is superadditive if for every two 
bundles $S, T$, $v(S) + v(T) \le v(S \cup T)$.) 

Hint: First show that if bidder $i$ receives the bundle $T_i$ in the optimal allocation, 
then $v_i(T_i) \ge v_j(T_i)$ for every bidder $j$. 


11.8. Consider a pair of valuations with the following form (where $0 < \alpha, \beta < 1$ are 
unknown to the seller): 


          v(ab)    v(a)        v(b) 
Alice     2        $\alpha$    $\beta$ 
Bob       2        2           2 


Prove that no item-price ascending auction can reveal enough information for 
determining the socially efficient allocation for such valuations. 


11.9. In a procurement auction with single-minded bidders, a single buyer needs to buy 
a set of $m$ items from $n$ possible suppliers. Each supplier $i$ can provide a single set 
of items $S_i$ for a privately known price $v_i$. The buyer needs to buy all items, and 
aims to minimize the total price paid. 


(a) Prove that the following greedy algorithm finds a $(1 + \ln m)$-approximation to 
the optimal procurement: 

  * Initialize $R$ to contain all $m$ items, and $W \leftarrow \emptyset$. 

  * Repeat until $R = \emptyset$: Choose $j \in \arg\max_j \frac{|R \cap S_j|}{v_j}$ and let 
    $W \leftarrow W \cup \{j\}$ and $R \leftarrow R \setminus S_j$. 

(b) Deduce an incentive-compatible polynomial-time $(1 + \ln m)$-approximation 
mechanism for procurement auctions among single-minded bidders. Show 
first that the allocation scheme defined by the algorithm is monotone, and 
identify the "critical values" to be paid by the winning suppliers. 
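
The greedy rule of Exercise 11.9(a) can be sketched in a few lines of Python. This is only an illustration of the selection step (most newly covered items per unit of price), not a solution to the exercise; the function name `greedy_procurement`, the input format (a list of `(items, price)` pairs with strictly positive prices), and the tie-breaking by lowest index are assumptions made here.

```python
def greedy_procurement(m, suppliers):
    """Greedy cover of items 0..m-1.

    suppliers: list of (items, price) pairs, where items is a set of item
    indices and price > 0 (hypothetical input format for this sketch).
    Returns the indices of the chosen suppliers, in the order chosen.
    """
    remaining = set(range(m))      # R: items still to be bought
    winners = []                   # W: chosen suppliers
    while remaining:
        best, best_ratio = None, 0.0
        for j, (items, price) in enumerate(suppliers):
            gain = len(remaining & items)   # |R intersect S_j|
            if gain == 0:
                continue
            ratio = gain / price
            if ratio > best_ratio:          # ties go to the lowest index
                best, best_ratio = j, ratio
        if best is None:
            raise ValueError("the suppliers cannot cover all items")
        winners.append(best)
        remaining -= suppliers[best][0]     # R <- R \ S_j
    return winners


example = [({0, 1, 2, 3}, 10.0), ({0, 1}, 2.0), ({2, 3}, 3.0), ({3}, 1.0)]
chosen = greedy_procurement(4, example)
total_price = sum(example[j][1] for j in chosen)
```

On this small instance the greedy choice pays 6 while buying from suppliers 1 and 2 alone would cost 5, which illustrates that the rule is an approximation and not an exact method.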


CHAPTER 12 


Computationally Efficient 
Approximation Mechanisms 


Ron Lavi 


Abstract 


We study the integration of game theoretic and computational considerations. In particular, we study 
the design of computationally efficient and incentive compatible mechanisms, for several different 
problem domains. Issues like the dimensionality of the domain, and the goal of the algorithm designer, 
are examined by providing a technical discussion on four results: (i) approximation mechanisms 
for single-dimensional scheduling, where truthfulness reduces to a simple monotonicity condition; 
(ii) randomness as a tool to resolve the computational vs. incentives clash for Combinatorial Auctions, 
a central multidimensional domain where this clash is notable; (iii) the impossibilities of determin- 
istic dominant-strategy implementability in multidimensional domains; and (iv) alternative solution 
concepts that fit worst-case analysis, and aim to resolve the above impossibilities. 


12.1 Introduction 


Algorithms in computer science, and Mechanisms in game theory, are very close in 
nature. Both disciplines aim to implement desirable properties, drawn from “real-life” 
needs and limitations, but the resulting two sets of properties are completely different. 
A natural need is then to merge them — to simultaneously exhibit “good” game theoretic 
properties as well as “good” computational properties. The growing importance of the 
Internet as a platform for computational interactions only strengthens the motivation 
for this. 

However, this integration task poses many difficult challenges. The two disciplines 
clash and contradict in several different ways, and new understandings must be ob- 
tained to achieve this hybridization. The classic Mechanism Design literature is rich 
and contains many technical solutions when incentive issues are the key goal. Quite 
interestingly, most of these are not computationally efficient. In parallel, most existing 
algorithmic techniques, answering the computational questions at hand, do not yield 
the game theoretic needs. There seems to be a certain clash between classic algorith- 
mic techniques and classic mechanism design techniques. This raises many intriguing 
questions: In what cases is this clash fundamental (a mathematical impossibility)? 
Alternatively, can we “fix” this clash by applying new techniques? We will try to give 
a feel for these issues. 

The possibility of constructing mechanisms with desirable computational proper- 
ties turns out to be strongly related to the dimensionality of the problem domain. 
In single-dimensional domains, the requirement for game-theoretic truthfulness re- 
duces to a convenient algorithmic monotonicity condition that leaves ample flexibility 
for the algorithm designer. We demonstrate this in Section 12.2, where we study the 
construction of computationally efficient approximation mechanisms for the classic 
machine scheduling problem. Although there exists a rich literature on approximation 
algorithms for this problem domain, quite remarkably none of these classic results 
satisfy the desired game-theoretic properties. We show that when the scheduling prob- 
lem is single-dimensional, then this clash is not fundamental, and can be successfully 
resolved. 

The problem domain of job scheduling has one additional interesting aspect that 
makes it worth studying: it demonstrates a key difference between economics and 
computer science, namely the goals of algorithms vs. the goals of classic mechanisms. 
While the economics literature mainly studies welfare and/or revenue maximization, 
computational models raise the need for completely different objectives. In scheduling 
problems, a common objective is to minimize the load on the most loaded machine. As 
is usually the case, existing techniques for incentive-compatible mechanism design do 
not fit such an objective (and, on the other hand, most existing algorithmic solutions do 
not yield the desired incentives). The resolution of these clashes has led to insightful 
techniques, and the technical exploration of Section 12.2 serves as an example. 

As opposed to single-dimensional domains, multidimensionality seems to pose 
much harder obstacles. In Chapter 9, the monotonicity conditions that characterize 
truthfulness for multidimensional domains were discussed, but it seems that these 
conditions do not translate well to algorithmic constructions. This issue will be handled 
in the rest of the chapter, and will be approached in three different ways: we will 
explore the inherent impossibilities that the required monotonicity conditions cast 
on deterministic algorithmic constructions, we will introduce randomness to solve 
these difficulties, and we will consider alternative notions to the solution concept of 
truthfulness. 

Our main example for a multidimensional domain will be the domain of combina- 
torial auctions (CAs). Chapter 11 studies CAs mostly from a computational point of 
view, and in contrast our focus is on designing computationally efficient and incentive 
compatible CAs. This demonstrates a second key difference between economics and 
computer science, namely the requirement for computational efficiency. Even if our 
goal is the classic economic goal of welfare maximization, we cannot use 
Vickrey-Clarke-Groves mechanisms (which classically implement this goal) since in many 
cases they are computationally inefficient. The domain of CAs captures exactly this 
point, and the need for computationally efficient techniques that translate algorithms to 
mechanisms is central. In Section 12.3 we will see how randomness can help. We de- 
scribe a rather general technique that uses randomness and linear programming in order 
to convert algorithms to truthful-in-expectation mechanisms. Thus we get a positive 
answer to the computational clash, by introducing randomness. 
In Section 12.4 we return to deterministic settings and to the classic definition 
of deterministic truthfulness, and study the impossibilities associated with it. Our 
motivating question is whether the three requirements (i) deterministic truthfulness, 
(ii) computational efficiency, and (iii) nontrivial approximation guarantees, clash in a 
fundamental and well-defined way. We already know that single dimensionality does 
not exhibit such a clash, and in this section we describe the other extreme. If a domain 
has full dimensionality (in a certain formal sense, to be discussed in the section body), 
then any truthful mechanism must be VCG. It is important to remark that this result fur- 
ther emphasizes our lack of knowledge about the state of affairs for all the intermediate 
range of multidimensional domains, to which CAs and their different variants belong. 

As was motivated in previous chapters, the game-theoretic quest should start with the 
solution concept of “implementation in dominant strategies,” and indeed most of this 
chapter follows this line of thought. However, to avoid the impossibilities mentioned 
earlier, we have to deepen our understanding of the alternatives at hand. Studies 
in economics usually turn to the solution concept of Bayesian-Nash that requires 
strong distributional assumptions, namely that the input distributions are known, and, 
furthermore, that they are commonly known, and agreed upon. Such assumptions seem 
too strong for CS settings, and criticism of these assumptions has also been raised 
by economists (e.g., "Wilson's doctrine"). We have already seen that randomization, 
and truthful-in-expectation in particular, can provide a good alternative. We conclude 
the chapter by providing an additional example, of a deterministic alternative solution 
concept, and describe a deterministic CA that uses this notion to provide nontrivial 
approximation guarantees. 

Let us mention two other types of GT-versus-CS clashes, not studied in this chap- 
ter, to complete the picture. Different models: Some CS models have a significantly 
different structure, which causes the above-mentioned clash even when traditional ob- 
jectives are considered. In online computation, for example, players arrive over time, 
a fundamentally different assumption than classic mechanism design. The difficulties 
that emerge, and the novel solutions proposed, are discussed in Chapter 16. Differ- 
ent analysis conventions: CS usually employs worst-case analysis, avoiding strong 
distributional assumptions, while in economics, the underlying distribution is usually 
assumed. This greatly affects the character of results, and the reader is referred to, e.g., 
Chapter 13 for a broader discussion. 


12.2 Single-Dimensional Domains: Job Scheduling 


As a first example for the interaction between game theory and algorithmic theory, we 
consider single-dimensional domains. Simple single-dimensional domains were intro- 
duced in Chapter 9, where every alternative is either a winning or a losing alternative 
for each player. Here we discuss a more general case. Intuitively, single dimensionality 
implies that a single parameter determines the player’s valuation vector. In Chapter 9, 
this was simply the value for winning, but less straightforward cases also make sense: 


Scheduling related machines. In this domain, $n$ jobs are to be assigned to $m$ machines, 
where job $j$ consumes $p_j$ time-units, and machine $i$ has speed $s_i$. Thus machine $i$ 
requires $p_j/s_i$ time-units to complete job $j$. Let $l_i = \sum_{j \mid j \text{ is assigned to } i} p_j$ be the load 
on machine $i$. Our schedule aims to minimize the term $\max_i l_i/s_i$ (the makespan). 
Each machine is a selfish entity, incurring a constant cost for every consumed time unit 
(and w.l.o.g. assume this cost is 1). Thus the utility of a machine from a load $l_i$ and 
a payment $P_i$ is $-l_i/s_i - P_i$. The mechanism designer knows the processing times of 
the jobs and constructs a scheduling mechanism. 


Although here the set of alternatives cannot be partitioned to “wins” and “loses,” 
this is clearly a single-dimensional domain. 


Definition 12.1 (single-dimensional linear domains) A domain $V_i$ of player 
$i$ is single-dimensional and linear if there exist nonnegative real constants (the 
"loads") $\{q_{i,a}\}_{a \in A}$ such that, for any $v_i \in V_i$, there exists $c \in \mathbb{R}_-$ (the "cost") such 
that $v_i(a) = q_{i,a} \cdot c$. 


In other words, the type of a player is simply her cost c, as disclosing it gives us the 
entire valuation vector. Note that the scheduling domain is indeed single-dimensional 
and linear: the parameter $c$ is (up to sign) equal to $1/s_i$, and the constant $q_{i,a}$ for 
alternative $a$ is the load assigned to $i$ according to $a$. 

A natural symmetric definition exists for value-maximization (as opposed to cost- 
minimization) problems, where the types are nonnegative. 

We aim to design a computationally efficient approximation algorithm, that is also 
implementable. As the social goal is a certain min—max criterion, and not to minimize 
the sum of costs, we cannot use the general VCG technique. Since we have a convex 
domain, Chapter 9 tells us that we need a “weakly monotone” algorithm. But what 
exactly does this mean? Luckily, the formulation of weak monotonicity can be much 
simplified for single-dimensional domains. 

If we fix the costs $c_{-i}$ declared by the other players, an algorithm for a 
single-dimensional linear domain determines the load $q_i(c)$ of player $i$ as a function of her 
reported cost $c$. Take two possible types $c$ and $c'$, and suppose $c' > c$. Then the weak 
monotonicity condition from Chapter 9 reduces to $-q_i(c')(c' - c) \ge -q_i(c)(c' - c)$, 
which holds iff $q_i(c') \le q_i(c)$. Hence from Chapter 9 we know that such an algorithm is 
implementable if and only if its load functions are monotone nonincreasing. Figure 12.1 
describes this, and will help us figure out the required prices for implementability. 


Figure 12.1. A monotone load curve (vertical axis: load $q_i(x)$; horizontal axis: cost $x$). 
Suppose that we charge a payment of $P_i(c) = \int_0^c [q_i(x) - q_i(c)]\,dx$ from player $i$ 
if he declares a cost of $c$. Using Figure 12.1, we can easily verify that these prices 
lead to incentive compatibility: Suppose that player $i$'s true cost is $c$. If he reports the 
truth, his utility is the entire area below the load curve up to $c$. Now if he declares 
some $c' > c$, his utility will decrease by exactly the area marked by A: his cost from 
the resulting load will indeed decrease to $c \cdot q_i(c')$, but his payment will increase to be 
the area between the line $q_i(c')$ and the load curve. On the other hand, if the player 
reports $c'' < c$, his utility will decrease by exactly the area marked by B, since his 
cost from the resulting load will increase to $c \cdot q_i(c'')$. Thus these prices satisfy the 
incentive-compatibility inequalities, and in fact this is a simple direct proof for the 
sufficiency of load monotonicity for this case. 

The above prices do not satisfy individual rationality, since a player always incurs 
a negative utility if we use these prices. To overcome this, the usual exercise is to add 
a large enough constant to the prices, which in our case can be $-\int_0^\infty q_i(x)\,dx$. Note that 
if we add this to the above prices we get that a player that does not receive any load 
(i.e., declares a cost of infinity) will have a zero utility, and in general the utility of a 
truthful player will be nonnegative, exactly $\int_c^\infty q_i(x)\,dx$. From all the above we get the 
following theorem. 


Theorem 12.2 An algorithm for a single-dimensional linear domain is 
implementable if and only if its load functions are nonincreasing. Furthermore, if this 
is the case then charging from every player $i$ a price 

$$P_i(c) = \int_0^c [q_i(x) - q_i(c)]\,dx - \int_0^\infty q_i(x)\,dx$$

will result in an individually rational dominant strategy implementation. 
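
As a numerical illustration of these prices, the following Python sketch evaluates them for a nonincreasing step load function and compares a player's utility under truthful and untruthful reports. It uses the equivalent closed form $P_i(c) = -c \cdot q_i(c) - \int_c^\infty q_i(x)\,dx$, which follows from expanding the two integrals. The step representation (`steps` as a list of `(right_endpoint, load)` pairs, with zero load beyond the last endpoint) and all function names are assumptions of this sketch.

```python
def load(steps, x):
    """q(x): load on the step containing x; zero beyond the last step."""
    for end, q in steps:
        if x < end:
            return q
    return 0.0


def tail_integral(steps, c):
    """Integral of q(x) over [c, infinity)."""
    total, start = 0.0, 0.0
    for end, q in steps:
        lo = max(start, c)
        if end > lo:
            total += q * (end - lo)
        start = end
    return total


def payment(steps, c):
    # P(c) = int_0^c [q(x) - q(c)] dx - int_0^inf q(x) dx
    #      = -c * q(c) - int_c^inf q(x) dx   (negative: the player is paid)
    return -c * load(steps, c) - tail_integral(steps, c)


def utility(steps, true_cost, report):
    """Utility -c * q(report) - P(report) of a player whose true cost is c."""
    return -true_cost * load(steps, report) - payment(steps, report)


# Hypothetical load curve: 10 on [0,1), 4 on [1,2), 1 on [2,3), 0 afterwards.
steps = [(1.0, 10.0), (2.0, 4.0), (3.0, 1.0)]
truthful = utility(steps, 1.5, 1.5)
others = [utility(steps, 1.5, r) for r in (0.0, 0.5, 1.0, 2.0, 2.5, 3.0, 5.0)]
```

For the true cost 1.5 the truthful utility equals the tail integral $\int_{1.5}^\infty q_i(x)\,dx = 3$, and none of the sampled misreports does better.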


In the application to scheduling, we will construct a randomized mechanism, as well 
as a deterministic one. In the randomized case, we will employ truthfulness in expec- 
tation (see Chapter 9, Definition 9.27). One should observe that, from the discussion 
above, it follows that truthfulness in expectation is equivalent to the monotonicity of 
the expected load. 


12.2.1 A Monotone Algorithm for the Job Scheduling Problem 


Now that we understand the exact form of an implementable algorithm, we can con- 
struct one that approximates the optimal outcome. In fact, the optimum itself is imple- 
mentable, since it can satisfy weak monotonicity (see the exercises for more details), 
but the computation of the optimal outcome is NP-hard. We wish to construct effi- 
ciently computable mechanisms, and hence design a monotone and polynomial-time 
approximation algorithm. Note that we face a “classic” algorithmic problem — no 
game-theoretic issues are left for us to handle. 

Before we start, let us assume that jobs and machines are reordered so that 
$s_1 \ge s_2 \ge \cdots \ge s_m$ and $p_1 \ge p_2 \ge \cdots \ge p_n$. For the algorithmic construction, we first need 
to estimate the optimal makespan of a given instance. 


Estimating the optimal makespan. Fix a job-index $j$, and some target makespan $T$. 
If a schedule has makespan at most $T$, then it must assign any job out of $1, \ldots, j$ to a 
machine $i$ such that $T \ge p_j/s_i$. Let $i(j, T) = \max\{i \mid T \ge p_j/s_i\}$. Thus any schedule 
with makespan at most $T$ assigns jobs $1, \ldots, j$ to machines $1, \ldots, i(j, T)$. From space 
considerations, it immediately follows that 

$$T \ge \frac{\sum_{k=1}^{j} p_k}{\sum_{l=1}^{i(j,T)} s_l} \qquad (12.1)$$

Now define 

$$T_j = \min_i \max\left\{ \frac{p_j}{s_i},\ \frac{\sum_{k=1}^{j} p_k}{\sum_{l=1}^{i} s_l} \right\} \qquad (12.2)$$

Lemma 12.3 For any job-index $j$, the optimal makespan is at least $T_j$. 


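PROOF Fix any $T < T_j$. We prove that $T$ violates 12.1, hence cannot be any 
feasible makespan, and the claim follows. Let $i_j$ be the index that determines $T_j$. 
The left expression in the max term is increasing with $i$, while the right term is 
decreasing. Thus $i_j$ is either the last $i$ where the right term is larger than the left 
one, or the first $i$ for which the left term is larger than the right one. We prove that 
$T$ violates 12.1 for each case separately. 

Case 1 ($\frac{\sum_{k=1}^{j} p_k}{\sum_{l=1}^{i_j} s_l} \ge \frac{p_j}{s_{i_j}}$): For $i_j + 1$ the max term is attained by its left 
expression, $p_j/s_{i_j+1}$; since $T_j$ is the min-max, we get $T_j \le p_j/s_{i_j+1}$. 
Since $T < T_j$, we have $i(j, T) \le i_j$, and 

$$T < T_j = \frac{\sum_{k=1}^{j} p_k}{\sum_{l=1}^{i_j} s_l} \le \frac{\sum_{k=1}^{j} p_k}{\sum_{l=1}^{i(j,T)} s_l}.$$

Hence $T$ violates 12.1, as claimed. 

Case 2 ($\frac{\sum_{k=1}^{j} p_k}{\sum_{l=1}^{i_j} s_l} < \frac{p_j}{s_{i_j}}$): $T_j \le \frac{\sum_{k=1}^{j} p_k}{\sum_{l=1}^{i_j - 1} s_l}$ since $T_j$ is the min-max, and the max for 
$i_j - 1$ is received at the right. In addition, $i(j, T) < i_j$ since $T_j = p_j/s_{i_j}$ and $T < T_j$. 
Thus 

$$T < T_j \le \frac{\sum_{k=1}^{j} p_k}{\sum_{l=1}^{i_j - 1} s_l} \le \frac{\sum_{k=1}^{j} p_k}{\sum_{l=1}^{i(j,T)} s_l},$$

as we need. 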


With this, we get a good lower bound estimate of the optimal makespan: 

$$T_{LP} = \max_j T_j \qquad (12.3)$$

The optimal makespan is at least $T_j$ for any $j$, hence it is at least $T_{LP}$. 
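
The lower bound can be computed directly from its definition: for each job index take the min over machines of the max of the two terms in (12.2), then take the max over jobs as in (12.3). The Python sketch below does this in O(nm) time; the function name `makespan_lower_bound` and the 0-based lists `p` (job sizes) and `s` (speeds), both assumed already sorted in nonincreasing order, are choices made for this illustration.

```python
def makespan_lower_bound(p, s):
    """Return T_LP = max_j T_j for job sizes p and machine speeds s.

    Both lists are assumed sorted in nonincreasing order, as in the text.
    """
    t_lp = 0.0
    prefix_p = 0.0                       # p_1 + ... + p_j
    for j in range(len(p)):
        prefix_p += p[j]
        prefix_s = 0.0                   # s_1 + ... + s_i
        t_j = float("inf")
        for i in range(len(s)):
            prefix_s += s[i]
            # Left term grows with i, right term shrinks with i.
            t_j = min(t_j, max(p[j] / s[i], prefix_p / prefix_s))
        t_lp = max(t_lp, t_j)
    return t_lp


# Three jobs of sizes 4, 3, 2 on two machines with speeds 2 and 1.
bound = makespan_lower_bound([4.0, 3.0, 2.0], [2.0, 1.0])
```

In this instance the bound is 3, which is tight: putting the jobs of sizes 4 and 2 on the fast machine and the job of size 3 on the slow one gives makespan 3.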


A fractional algorithm. We start with a fractional schedule. If machine $i$ gets an $\alpha$ 
fraction of job $j$ then the resulting load is assumed to be $(\alpha \cdot p_j)/s_i$. This is of course 
not a valid schedule, and we later round it to an integral one. 


Definition 12.4 (The fractional allocation) Let $j$ be the first job such that 
$\sum_{k=1}^{j} p_k > T_{LP} \cdot s_1$. Assign to machine 1 jobs $1, \ldots, j - 1$, plus a fraction of 
$j$ in order to equate $l_1 = T_{LP} \cdot s_1$. Continue recursively with the unassigned 
fractions of jobs and with machines $2, \ldots, m$. 
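
The fractional allocation can be sketched as a single left-to-right sweep: each machine is filled up to work $T_{LP} \cdot s_i$, and the job that overflows is split. In the Python sketch below the value of $T_{LP}$ is taken as an input (assumed to have been computed as in (12.3)); the function name, the output format `frac[i][j]` (fraction of job `j` on machine `i`), and the small floating-point tolerance are assumptions of this illustration.

```python
def fractional_allocation(p, s, t_lp, eps=1e-12):
    """Fill machines in order, each up to work t_lp * s[i].

    p, s: job sizes and speeds, both sorted in nonincreasing order.
    Returns a list of dicts: frac[i][j] is the fraction of job j on machine i.
    """
    frac = [dict() for _ in s]
    j, left = 0, 1.0                 # current job and its unassigned fraction
    for i, speed in enumerate(s):
        capacity = t_lp * speed      # work that still fits on machine i
        while j < len(p) and capacity > eps:
            work = left * p[j]
            if work <= capacity + eps:
                frac[i][j] = left    # the rest of job j fits entirely
                capacity -= work
                j, left = j + 1, 1.0
            else:
                share = capacity / p[j]
                frac[i][j] = share   # machine i is now exactly full
                left -= share
                capacity = 0.0
    return frac


p, s = [4.0, 3.0, 2.0], [2.0, 1.0]
frac = fractional_allocation(p, s, 3.0)   # T_LP = 3 for this instance
```

Here machine 1 receives job 1 in full and two thirds of job 2, and machine 2 receives the remaining third of job 2 and all of job 3, so both machines are loaded exactly up to $T_{LP}$.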
Lemma 12.5 There is enough space to fractionally assign all jobs, and if job 
$j$ is fractionally assigned to machine $i$ then $p_j/s_i \le T_{LP}$. 


PROOF Let $i_j$ be the index that determines $T_j$. Since 
$T_{LP} \ge T_j \ge \frac{\sum_{k=1}^{j} p_k}{\sum_{l=1}^{i_j} s_l}$, we 
can fractionally assign jobs $1, \ldots, j$ up to machine $i_j$. Since $T_j \ge p_j/s_{i_j}$, we get 
the second part of the claim, and setting $j = n$ gives the first part. 


Lemma 12.6 The fractional load function is monotone. 


PROOF We show that if $s_i$ decreases to $s'_i = s_i/\alpha$ (for $\alpha > 1$) then $l'_i \le l_i$. Let 
$T'_{LP}$ denote the new estimate of the optimal makespan. We first claim that 
$T'_{LP} \le \alpha \cdot T_{LP}$. For an instance $s''_1, \ldots, s''_m$ such that $s''_l = s_l/\alpha$ for all machines $l$ we have 
that $T''_{LP} = \alpha \cdot T_{LP}$ since both terms in the max expression of $T_j$ were multiplied 
by $\alpha$. Since $s''_l \le s'_l$ for all $l$ we have that $T'_{LP} \le T''_{LP}$. Now, if $l_i = T_{LP} \cdot s_i$, i.e. $i$ 
was full, then $l'_i \le T'_{LP} \cdot s'_i \le T_{LP} \cdot s_i = l_i$. Otherwise $l_i < T_{LP} \cdot s_i$, hence $i$ is the 
last nonempty machine. Since $T'_{LP} \ge T_{LP}$, all previous machines now get at least 
the same load as before, hence machine $i$ cannot get more load. 


We now round to an integral schedule. The natural rounding, of integrally placing 
each job on one of the machines that got some fraction of it, provides a 2-approximation, 
but violates the required monotonicity (see the exercises). We offer two types of 
rounding, a randomized rounding and a deterministic one. The former is simpler, 
and results in a better approximation ratio, but uses the weaker solution concept of 
truthfulness in expectation. The latter is slightly more involved, and uses deterministic 
truthfulness, but results in an inferior approximation ratio. 


Definition 12.7 (A randomized rounding) Choose $\alpha \in [0, 1]$ uniformly at 
random. For every job $j$ that was fractionally assigned to $i$ and $i + 1$, if $j$'s 
fraction on $i$ is at least $\alpha$, assign $j$ to $i$ in full, otherwise assign $j$ to $i + 1$. 
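
A minimal Python sketch of this rounding is given below. It assumes the fractional allocation is supplied as a list of dicts `frac[i][j]` (fraction of job `j` on machine `i`), with every split job shared by two consecutive machines; this input format, the function name, and the optional `rng` argument (any object with a `random()` method, used here so that the threshold can be fixed for testing) are assumptions of the sketch.

```python
import random


def randomized_rounding(frac, rng=None):
    """Round a fractional allocation using one uniform threshold alpha.

    Returns a dict mapping each job to the machine that receives it in full.
    """
    rng = rng or random.Random()
    alpha = rng.random()                  # a single draw shared by all jobs
    assignment = {}
    for i, jobs in enumerate(frac):
        for j, share in jobs.items():
            if j in assignment:           # already decided at machine i - 1
                continue
            if share >= 1.0 - 1e-12:      # job was never split
                assignment[j] = i
            elif share >= alpha:          # fraction on i is at least alpha
                assignment[j] = i
            else:
                assignment[j] = i + 1
    return assignment


class FixedThreshold:
    """Hypothetical stand-in for a random source, returning a fixed alpha."""

    def __init__(self, value):
        self.value = value

    def random(self):
        return self.value


frac = [{0: 1.0, 1: 2.0 / 3.0}, {1: 1.0 / 3.0, 2: 1.0}]
low = randomized_rounding(frac, FixedThreshold(0.5))
high = randomized_rounding(frac, FixedThreshold(0.9))
```

With threshold 0.5 the split job (two thirds on the first machine) stays on the first machine; with threshold 0.9 it moves to the second. Over a uniform threshold it lands on the first machine with probability two thirds, matching its fraction there.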


Theorem 12.8 The randomized scheduling algorithm is truthful in expectation, 
and obtains a 2-approximation to the optimal makespan in polynomial time. 


PROOF Let us check the approximation first. A machine $i$ may get, in addition 
to its full jobs, two more jobs. One, $j$, is shared with machine $i - 1$, and the 
other, $k$, is shared with machine $i + 1$. If $j$ was rounded to $i$ then $i$ initially has 
at least a $1 - \alpha$ fraction of $j$, hence the additional load caused by $j$ is at most 
$\alpha \cdot p_j$. Similarly, if $k$ was rounded to $i$ then $i$ initially has at least an $\alpha$ fraction of $k$, 
hence the additional load caused by $k$ is at most $(1 - \alpha) \cdot p_k$. Thus the maximal 
total additional load that $i$ gets is $\alpha \cdot p_j + (1 - \alpha) \cdot p_k$. By Lemma 12.5 we have 
that $\max\{p_j, p_k\} \le T_{LP} \cdot s_i$ and since $T_{LP}$ is not larger than the optimal 
makespan, the approximation claim follows. 

For truthfulness, we only need that the expected load is monotone. Note that 
if machine $i - 1$ holds a fraction $f$ of job $j$, then it gets $j$ in full exactly when 
$\alpha \le f$, i.e., with probability $f$, so $i$ gets it with probability $1 - f$, which is $i$'s 
fraction of $j$; similarly, $i$ gets $k$ with probability equal to its fraction of $k$. So the 
expected load of machine $i$ is exactly its fractional load. The claim now follows 
from Lemma 12.6. 


An integral deterministic algorithm. To be accurate, what follows is not exactly 
a rounding of the fractional assignment we obtained above, but a similar-in-spirit 
deterministic assignment. We set virtual speeds, where the fastest machine is set to 
be slightly faster, and the others are set to be slightly slower, we find a fractional 
assignment according to these virtual speeds, and then use the “natural” rounding of 
placing each job fully on the first machine it is fractionally assigned to. With these 
virtual speeds, the rounding that previously failed to be monotone, now succeeds: 


Definition 12.9 (A deterministic algorithm) Given the bids $s_1, \ldots, s_m$, perform: 

(i) Set new (virtual) speeds $d_1, \ldots, d_m$, as follows. Let $d_1 = \frac{8}{5} s_1$, and for $i \ge 2$, let 
$d_i$ be the closest value of the "breakpoints" $s_1/2.5^k$ (for $k = 1, 2, \ldots$) such that 
$d_i \le s_i$. 

(ii) Compute $T_{LP}$ according to the virtual speeds, i.e. $T_{LP} = T_{LP}(d_i, d_{-i})$. 

(iii) Assign jobs to machines, starting from the largest job and the fastest machine. 
Move to the next machine when the current machine, $i$, holds jobs with total 
processing time larger or equal to $T_{LP} \cdot d_i$. 


Note that if the fastest machine changes its speed, then all the $d_i$'s may change. Also 
note that step (iii) manages to assign all jobs, since what we are doing is exactly the 
deterministic natural rounding described above for the fractional assignment, using the 
$d_i$'s instead of the $s_i$'s. As we shall see, this crucial difference enables monotonicity, 
at the cost of a certain loss in the approximation. 
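
The virtual speeds of step (i) are easy to compute. The Python sketch below reads the definition as: the fastest machine is scaled up by 8/5 = 1.6, and every other machine is rounded down to the largest breakpoint of the form $s_1/2.5^k$ ($k \ge 1$) that does not exceed its speed. This reading, the function name `virtual_speeds`, and the assumption of strictly positive speeds sorted in nonincreasing order are choices made for this illustration.

```python
def virtual_speeds(s):
    """Virtual speeds d for declared speeds s (nonincreasing, s[0] fastest)."""
    s1 = s[0]
    d = [1.6 * s1]                       # d_1 = (8/5) * s_1
    for speed in s[1:]:
        k = 1
        # Walk down the breakpoints s_1 / 2.5^k until one is at most `speed`.
        while s1 / 2.5 ** k > speed:
            k += 1
        d.append(s1 / 2.5 ** k)
    return d


speeds = [10.0, 9.0, 3.0, 1.0]
d = virtual_speeds(speeds)               # approximately [16, 4, 1.6, 0.64]
```

Two properties used in the analysis are visible in this example: every machine other than the fastest is slowed by a factor of at most 2.5, and $d_1 \ge 4 d_2$ always holds, since $d_2 \le s_1/2.5$ while $d_1 = 1.6 s_1$.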

To exactly see the approximation loss, first note that $T_{LP}(d) \le 2.5\,T_{LP}(s)$, since 
speeds are made slower by at most this factor. For the fastest machine, since $s_1$ is 
lower than $d_1$, the actual load up to $T_{LP}(d)$ may be $1.6\,T_{LP}(d) \le 4\,T_{LP}(s)$. As we may 
integrally place on machine 1 one job that is partially assigned also to machine 2, 
observe (i) that $d_1 \ge 4 d_2$, and (ii) by the fractional rules the added job has load at most 
$T_{LP}(d)\,d_2$. Thus we get that the load on machine 1 is at most $\frac{5}{4} \cdot 1.6\,T_{LP}(d) \le 5\,T_{LP}(s)$. For 
any other machine, $d_i \le s_i$, and so after we integrally place the one extra partial job 
the load can be at most $2\,T_{LP}(d)\,d_i \le 2 \cdot 2.5\,T_{LP}(s)\,s_i = 5\,T_{LP}(s)\,s_i$. Since $T_{LP}(s)$ lower 
bounds the optimal makespan for $s$ the approximation follows. 

To understand why monotonicity holds, we first need a few observations that easily 
follow from our knowledge on the fractional assignment. 


For any $i > 1$ and $\beta < d_i$, $T_{LP}(\beta, d_{-i}) \le \frac{5}{4} T_{LP}(d_i, d_{-i})$. Consider the following 
modification to the fractional assignment for $(d_i, d_{-i})$: machine $i$ does not get any job, and 
each machine $1 \le i' < i$ gets the jobs that were previously assigned to machine $i' + 1$. 
Since $i'$ is faster than $i' + 1$, any machine $2 \le i' < i$ does not cross the $T_{LP}(d_i, d_{-i})$ 
limit. As for machine 1, note that it is always the case that $d_1 \ge 4 d_2$, hence the new load 
on machine 1 is at most $\frac{5}{4} T_{LP}(d_i, d_{-i})$. 
If a machine $i > 1$ slows down then the total work assigned to the faster machines does 
not decrease, which follows immediately from the fact that $T_{LP}(d'_i, d_{-i}) \ge T_{LP}(d_i, d_{-i})$, 
for $d'_i \le d_i$. 


If the fastest machine slows down, yet remains the fastest, then its assigned work does 
not increase. Let $s'_1 = c \cdot s_1$ for some $c < 1$. Therefore all breakpoints shift by a factor 
of $c$. If no speed $s_i$ moves to a new breakpoint then all $d$'s move by a factor of $c$, the 
resulting $T_{LP}$ will therefore also move by a factor of $c$, meaning that machine 1 will 
get the same set of jobs as before. If additionally some $s_i$'s move to a new breakpoint 
this implies that the respective $d_i$'s decrease, and by the monotonicity of $T_{LP}$ it also 
decreases, which means that machine 1 will not get more work. 


Lemma 12.10 The deterministic algorithm is monotone. 


PROOF Suppose that machine i slows down from s; to s; < s;. We need to show 
that it does not get more work. Assume that the vector d has indeed changed 
because of i’s change. 

If $i$ is the fastest machine and it remains the fastest then the above observation
is what we need. If the fastest machine changes to $i'$, then we add an artificial
breakpoint to the slowdown, where the speeds of $i$ and $i'$ are identical, and the
title of the “fastest machine” moves from $i$ to $i'$. Note that the same threshold, $T$, is
computed when the title goes from $i$ to $i'$. The work of $i$ when it is the “fastest machine”
is at least $\frac{8}{5} s_i \cdot T$, while its work when $i'$ is the fastest is at most
$2 \cdot \frac{s_i}{2.5} \cdot T \le \frac{8}{5} s_i \cdot T$, hence it decreases.

If $i$ is not the fastest, but still full, then $d_i' < d_i$ (since the breakpoints remain
fixed), and therefore $T_{LB}(d_i', d_{-i}) \le \frac{5}{4} T_{LB}(d_i, d_{-i})$. With $s_i$, the work of $i$ is at least
$T \cdot d_i$ (where $T = T_{LB}(d_i, d_{-i})$), and with $s_i'$ its work is at most
$2 \cdot \frac{5}{4} T \cdot \frac{d_i}{2.5} = T \cdot d_i$, hence the load of $i$ does not increase.

Finally, note that if $i$ is not full then, by the observation above about a machine $i > 1$
slowing down, the work of the previous machines does not decrease, and so the work of $i$ does not increase.


By the above arguments we immediately get the following theorem. 


Theorem 12.11 There exists a truthful deterministic mechanism for scheduling 
related machines, that approximates the makespan by a factor of 5. 


A note about price computation is in order. A polynomial-time mechanism must
compute the prices in polynomial time. To compute the prices for both the randomized
and the deterministic mechanisms, we need to integrate over the load function of a
player, fixing the others' speeds. In both cases this is a step function, with a polynomial
number of steps (when a player declares a large enough speed she will get all jobs, and
as she decreases her speed more and more jobs will be assigned elsewhere, where the set
of assigned jobs will decrease monotonically). Thus price computation takes
polynomial time.
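
To illustrate why a step-shaped load function makes the integral cheap, here is a minimal Python sketch. It only computes the area under a step function; the exact payment formula built from that integral is the one given in Chapter 9 and is not reproduced here. The function name `integrate_step` and the sample breakpoints and loads are assumptions made for this illustration.

```python
from bisect import bisect_right

def integrate_step(breakpoints, values, lo, hi):
    """Integrate a right-continuous step function over [lo, hi].

    breakpoints must be sorted. values[k] is the function value on
    [breakpoints[k], breakpoints[k+1]); the last value extends to +infinity,
    and the function is 0 below breakpoints[0].
    """
    if hi <= lo:
        return 0.0
    total = 0.0
    # Cut [lo, hi] at every breakpoint strictly inside it.
    cuts = [lo] + [b for b in breakpoints if lo < b < hi] + [hi]
    for left, right in zip(cuts, cuts[1:]):
        k = bisect_right(breakpoints, left) - 1  # index of the piece containing `left`
        height = values[k] if k >= 0 else 0.0
        total += height * (right - left)
    return total

# Hypothetical load of one player as a function of her declared speed.
breakpoints = [0, 1, 2, 4]
loads = [0, 3, 7, 10]
area = integrate_step(breakpoints, loads, 0, 5)
```

The running time is linear in the number of steps, so a polynomial number of steps gives a polynomial-time integral.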

Without the monotonicity requirement, a PTAS for related machines exists. The
question whether one can incorporate truthfulness is still open.


Open Question Does there exist a truthful PTAS for related machines?


The technical discussion of this section aims to demonstrate that, for single- 
dimensional domains, the algorithmic implications of the game-theoretic requirement 
are “manageable,” and leave ample flexibility for the algorithmic designer. Multi- 
dimensionality, on the other hand, does not exhibit this easy structure, and the rest of 
this chapter is concerned with exactly this issue. 


12.3 Multidimensional Domains: Combinatorial Auctions 


As opposed to single-dimensional domains, the monotonicity conditions that charac- 
terize implementability in multidimensional domains are far more complex (see the 
discussion in Chapter 9), hence designing implementable approximation algorithms is 
harder. As discussed in the Introduction, this chapter examines three aspects of this 
issue, and in this section we will utilize randomness to overcome the difficulties of 
implementability in multidimensional domains. We study this for the representative 
and central problem domain of Combinatorial Auctions. 

Combinatorial Auctions (CAs) are a central model with theoretical importance
and practical relevance. The model generalizes many theoretical algorithmic settings, like job
scheduling and network routing, and arises in many real-life situations. Chapter 11
is exclusively devoted to CAs, providing a comprehensive discussion on the model and 
its various computational aspects. Our focus here is different: how to design CAs that 
are, simultaneously, computationally efficient and incentive-compatible. While each 
aspect is important on its own, obviously only the integration of the two provides an 
acceptable solution. 

Let us briefly restate the essentials. In a CA, we allocate $m$ items ($\Omega$) to $n$ players.
Players value subsets of items, and $v_i(S)$ denotes $i$'s value of a bundle $S \subseteq \Omega$.
Valuations additionally satisfy (i) monotonicity, i.e., $v_i(S) \le v_i(T)$ for $S \subseteq T$, and (ii)
normalization, i.e., $v_i(\emptyset) = 0$. In this section we consider the goal of maximizing the
social welfare: find an allocation $(S_1, \ldots, S_n)$ that maximizes $\sum_i v_i(S_i)$.
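
As a concrete reference point, the welfare-maximization objective can be written as an exhaustive search. This is a sketch for tiny instances only (it enumerates $(n+1)^m$ assignments) and is not one of the algorithms discussed in this chapter. The function name `max_welfare` and the two sample valuations are assumptions made for this illustration.

```python
from itertools import product

def max_welfare(items, valuations):
    """Exhaustively search all assignments of items to players (or to nobody).

    valuations[i] is a function from a frozenset bundle to player i's value.
    Returns (best_welfare, best_allocation), best_allocation[i] being a frozenset.
    """
    n = len(valuations)
    best_value, best_alloc = float("-inf"), None
    # Each item goes to one of the n players, or stays unallocated (owner index n).
    for owners in product(range(n + 1), repeat=len(items)):
        alloc = [frozenset(it for it, o in zip(items, owners) if o == i)
                 for i in range(n)]
        value = sum(valuations[i](alloc[i]) for i in range(n))
        if value > best_value:
            best_value, best_alloc = value, alloc
    return best_value, best_alloc

def single_minded(bundle):
    # Hypothetical player who wants both items together, for a value of 5.
    return 5 if {"a", "b"} <= bundle else 0

def additive(bundle):
    # Hypothetical player who values every item at 2.
    return 2 * len(bundle)

best_value, best_alloc = max_welfare(["a", "b"], [single_minded, additive])
```

Both sample valuations are monotone and normalized, as the model requires.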

Since a general valuation has size exponential in n and m, the representation issue
must be taken into account. Chapter 11 examines two models. In the bidding languages
model, the bid of a player represents his valuation in a concise way. For this model it is
NP-hard to approximate the social welfare within a ratio of $\Omega(m^{1/2-\epsilon})$, for any $\epsilon > 0$ (if
single-minded bids are allowed). In the query access model, the mechanism iteratively
queries the players in the course of computation. For this model, any algorithm with
polynomial communication cannot obtain an approximation ratio of $\Omega(m^{1/2-\epsilon})$ for
any $\epsilon > 0$. These bounds are tight, as there exists a deterministic $\sqrt{m}$-approximation
with polynomial computation and communication. Thus, for the general case, the
computational status by itself is well-understood.

The basic incentives issue is again well-understood: with VCG (which requires the 
exact optimum) we can obtain truthfulness. The two considerations therefore clash if 
we attempt to use classic techniques, and our aim is to develop a new technique that will 
combine the two desirable aspects of efficient computation and incentive compatibility. 

We describe a rather general LP-based technique to convert approximation algo- 
rithms to truthful mechanisms, by using randomization: given any algorithm to the 
general CA problem that outputs a c-approximation to the optimal fractional social
welfare, one can construct a randomized c-approximation mechanism that is truthful in 
expectation. Thus, the same approximation guarantee is maintained. The construction 
and proof are described in three steps. We first discuss the fractional domain, where 
we allocate fractions of items. We then show how to move back to the original do- 
main while maintaining truthfulness, by using randomization. This uses an interesting 
decomposition technique, which we then describe. 


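The fractional domain. Let $x_{i,S}$ denote the fraction of subset $S$ that player $i$ receives
in allocation $x$. Assume that her value for that fraction is $x_{i,S} \cdot v_i(S)$. The welfare
maximization becomes an LP:


\[
\begin{aligned}
\max \quad & \sum_{i,\, S \neq \emptyset} x_{i,S} \cdot v_i(S) && \text{(CA-P)} \\
\text{subject to} \quad & \sum_{S \neq \emptyset} x_{i,S} \le 1 && \text{for each player } i \quad (12.4) \\
& \sum_{i} \sum_{S :\, j \in S} x_{i,S} \le 1 && \text{for each item } j \quad (12.5) \\
& x_{i,S} \ge 0 && \forall i,\ S \neq \emptyset.
\end{aligned}
\]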


By constraint 12.4, a player receives at most one integral subset, and constraint 12.5 
ensures that each item is not overallocated. The empty set is excluded for technical 
reasons that will become clear below. This LP is solvable in time polynomial in its size 
by using, e.g., the ellipsoid method. Its size is related to our representation assumption. 
If we assume the bidding languages model, where the LP has size polynomial in the 
size of the bid (e.g., k-minded players), then we have a polynomial-time algorithm. If 
we assume general valuations and a query-access, this LP is solvable with a polynomial 
number of demand queries (see Chapter 11). Note that, in either case, the number of 
nonzero $x_{i,S}$ coordinates is polynomial, since we obtain $x$ in polynomial time (this will
become important below). In addition, since we obtain the optimal allocation, we can 
use VCG (see Chapter 9) to get: 


Proposition 12.12 In the fractional case, there exists a truthful optimal mech- 
anism with efficient computation and communication, for both the bidding lan- 
guages model and the query-access model. 
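
A small instance shows why the fractional and integral optima of CA-P can differ. The sketch below checks constraints 12.4 and 12.5 for a sparse point and evaluates its welfare; it does not solve the LP. The three single-minded bids and the helper names `is_feasible` and `welfare` are assumptions made for this illustration.

```python
from fractions import Fraction

def is_feasible(x, players, items):
    """Check constraints (12.4) and (12.5) for a sparse fractional allocation.

    x maps (player, bundle) -> fraction, where bundle is a nonempty frozenset.
    """
    if any(val < 0 or not bundle for (_, bundle), val in x.items()):
        return False
    per_player = all(
        sum(val for (i, _), val in x.items() if i == p) <= 1 for p in players)
    per_item = all(
        sum(val for (_, s), val in x.items() if j in s) <= 1 for j in items)
    return per_player and per_item

def welfare(x, v):
    """Fractional welfare: sum of x[i, S] * v[i, S] over the support of x."""
    return sum(val * v[key] for key, val in x.items())

items = ["a", "b", "c"]
players = [0, 1, 2]
# Hypothetical single-minded bids: each player wants one pair of items, value 1.
v = {(0, frozenset("ab")): 1, (1, frozenset("bc")): 1, (2, frozenset("ac")): 1}

half = {key: Fraction(1, 2) for key in v}          # a fractional point
one_winner = {(0, frozenset("ab")): Fraction(1)}   # an integral allocation
everyone = {key: Fraction(1) for key in v}         # overallocates every item
```

Any two of the desired bundles intersect, so an integral allocation serves at most one player and has welfare 1, while the point `half` is feasible with welfare 3/2. Summing constraint 12.5 over the three items shows that 3/2 is also the fractional optimum for these bids.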


The transition to the integral case. The following technical lemma allows for an 
elegant transition, by using randomization. 


Definition 12.13 Algorithm $A$ “verifies a $c$-integrality-gap” (for the linear program
CA-P) if it receives as input real numbers $w_{i,S}$, and outputs an integral point
$\tilde{x}$ which is feasible for CA-P, and

\[
c \cdot \sum_{i,S} w_{i,S} \cdot \tilde{x}_{i,S} \;\ge\; \max_{\text{feasible } x\text{'s}} \sum_{i,S} w_{i,S} \cdot x_{i,S}
\]


Lemma 12.14 (The decomposition lemma) Suppose that $A$ verifies a
$c$-integrality-gap for CA-P (in polynomial time), and $x$ is any feasible point of
CA-P. Then one can decompose $x/c$ to a convex combination of integral feasible
points. Furthermore, this can be done in polynomial time.


Let $\{x^l\}_{l \in \mathcal{I}}$ be all integral allocations. The proof will find $\{\lambda_l\}_{l \in \mathcal{I}}$ such that
(i) $\forall l \in \mathcal{I}$, $\lambda_l \ge 0$, (ii) $\sum_{l \in \mathcal{I}} \lambda_l = 1$, and (iii) $\sum_{l \in \mathcal{I}} \lambda_l \cdot x^l = x/c$. We will also need to provide
the integrality gap verifier. But first we show how to use all this to move back to the
integral case, while maintaining truthfulness.


Definition 12.15 (The decomposition-based mechanism)
(i) Compute an optimal fractional solution, $x^*$, and VCG prices $p_i^F(v)$.
(ii) Obtain a decomposition $x^*/c = \sum_{l \in \mathcal{I}} \lambda_l \cdot x^l$.
(iii) With probability $\lambda_l$: (a) choose allocation $x^l$, (b) set prices
$p_i^R(v) = [v_i(x^l)/v_i(x^*)] \cdot p_i^F(v)$.


The strategic properties of this mechanism hold whenever the expected price equals
the fractional price over $c$. The specific prices chosen satisfy, in addition to that, strong
individual rationality (i.e., truth-telling ensures a nonnegative utility, regardless of
the randomized choice)¹: VCG is individually rational, hence $p_i^F(v) \le v_i(x^*)$. Thus
$p_i^R(v) \le v_i(x^l)$ for any $l \in \mathcal{I}$.
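
Step (iii) of the mechanism can be sketched in a few lines of Python. The sketch takes the decomposition and the fractional prices as given inputs; it computes neither the LP optimum nor the VCG prices. The instance (three players, each wanting one pair out of three items), the value 1/5 used for the fractional prices, and all function names are assumptions chosen only to exercise the price formula. In particular, 1/5 is not claimed to be the VCG price of this instance.

```python
import random
from fractions import Fraction

def randomized_prices(allocations, value, frac_value, frac_price):
    """Per-outcome prices of step (iii): p_i^R = [v_i(x^l) / v_i(x*)] * p_i^F.

    allocations[l][i] -- bundle of player i in integral allocation x^l
    value(i, S)       -- player i's value for bundle S
    frac_value[i]     -- v_i(x*), her value in the fractional optimum
    frac_price[i]     -- p_i^F(v), her fractional price (an input here)
    """
    return [[value(i, alloc[i]) / frac_value[i] * frac_price[i]
             if frac_value[i] != 0 else Fraction(0)
             for i in range(len(alloc))]
            for alloc in allocations]

def expected_price(lambdas, prices, i):
    return sum(lam * p[i] for lam, p in zip(lambdas, prices))

def run_mechanism(lambdas, allocations, prices, rng=random):
    """Sample one outcome l with probability lambdas[l]."""
    l = rng.choices(range(len(lambdas)), weights=[float(w) for w in lambdas])[0]
    return allocations[l], prices[l]

bundles = [frozenset("ab"), frozenset("bc"), frozenset("ac")]

def value(i, S):
    # Single-minded: value 1 for any bundle containing the desired pair.
    return Fraction(1) if bundles[i] <= S else Fraction(0)

c = Fraction(3, 2)
# x* puts 1/2 on each desired bundle, so x*/c puts 1/3 on each, which is the
# uniform mixture of the three "one winner" allocations below.
lambdas = [Fraction(1, 3)] * 3
allocations = [[bundles[i] if i == l else frozenset() for i in range(3)]
               for l in range(3)]
frac_value = [Fraction(1, 2)] * 3
frac_price = [Fraction(1, 5)] * 3   # hypothetical, only needs p_i^F <= v_i(x*)
prices = randomized_prices(allocations, value, frac_value, frac_price)
```

With these inputs each winner pays 2/5 and each loser pays 0, so no outcome leaves a truthful player with negative utility, and every player's expected price is 2/15, which is her fractional price divided by $c$.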


Lemma 12.16 The decomposition-based mechanism is truthful in expectation, 
and obtains a c-approximation to the social welfare. 


PROOF The expected social welfare of the mechanism is $(1/c) \sum_i v_i(x^*)$, and
since $x^*$ is the optimal fractional allocation, the approximation guarantee follows.
For truthfulness, we first need that the expected price of a player equals her
fractional price over $c$, i.e., $E_{\lambda_l}[p_i^R(v)] = p_i^F(v)/c$:

\[
\begin{aligned}
E_{\lambda_l}[p_i^R(v)] &= \sum_{l \in \mathcal{I}} \lambda_l \cdot [v_i(x^l)/v_i(x^*)] \cdot p_i^F(v) \\
&= [p_i^F(v)/v_i(x^*)] \cdot \sum_{l \in \mathcal{I}} \lambda_l \cdot v_i(x^l) \\
&= [p_i^F(v)/v_i(x^*)] \cdot v_i(x^*/c) = p_i^F(v)/c \qquad (12.6)
\end{aligned}
\]


Fix any $v_{-i} \in V_{-i}$. Suppose that when $i$ declares $v_i$, the fractional optimum is
$x^*$, and when she declares $v_i'$, the fractional optimum is $z^*$. The VCG fractional
prices are truthful, hence

\[
v_i(x^*) - p_i^F(v_i, v_{-i}) \ge v_i(z^*) - p_i^F(v_i', v_{-i}) \qquad (12.7)
\]

By 12.6 and by the decomposition, dividing 12.7 by $c$ yields

\[
\sum_{l \in \mathcal{I}} \lambda_l \cdot v_i(x^l) - E_{\lambda_l}[p_i^R(v_i, v_{-i})] \;\ge\; \sum_{l \in \mathcal{I}} \lambda'_l \cdot v_i(z^l) - E_{\lambda'_l}[p_i^R(v_i', v_{-i})]
\]

where $z^*/c = \sum_{l \in \mathcal{I}} \lambda'_l \cdot z^l$ is the decomposition obtained for $z^*$.
The left-hand side is the expected utility for declaring $v_i$ and the right-hand side
is the expected utility for declaring $v_i'$, and the lemma follows.


1 See Chapter 9 for definitions and a discussion on randomized mechanisms.


The above analysis is for one-shot mechanisms, where a player declares his valuation 
up-front (the bidding languages model). For the query-access model, where players 
are being queried iteratively, the above analysis leads to the weaker solution concept 
of ex-post Nash: if all other players are truthful, player i will maximize his expected 
utility by being truthful. 

For example, consider the following single item auction for two players: player I
bids first, player II observes I’s bid and then bids. The highest bidder wins and pays
the second highest value. Here, truthfulness fails to be a dominant strategy. Suppose II
chooses the strategy “if I bids above 5, I bid 20, otherwise I bid 2.” If I’s true value is 6,
his best response is to declare 5. However, truthfulness is an ex-post Nash equilibrium:
if II fixes any value and bids that, then, regardless of II’s bid, I’s best response is the
truth.
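
The numbers in this example can be checked directly. The sketch below assumes that ties are lost by player I and that bids are integers between 0 and 30; both are choices made for the illustration, as are the function names.

```python
def second_price_utility(value, my_bid, other_bid):
    """Utility of a bidder in a two-player second-price auction (ties lose)."""
    return value - other_bid if my_bid > other_bid else 0

def reactive_bid_of_II(bid_of_I):
    """The strategy from the text: bid 20 if I bids above 5, otherwise bid 2."""
    return 20 if bid_of_I > 5 else 2

value_of_I = 6
truthful = second_price_utility(value_of_I, 6, reactive_bid_of_II(6))
shaded = second_price_utility(value_of_I, 5, reactive_bid_of_II(5))

def truth_is_best_response(value, fixed_other_bid, candidate_bids):
    """Against a bid that does not react to I, no candidate bid beats the truth."""
    truthful_utility = second_price_utility(value, value, fixed_other_bid)
    return all(second_price_utility(value, b, fixed_other_bid) <= truthful_utility
               for b in candidate_bids)
```

Against the reactive strategy, bidding truthfully gives player I utility 0 while bidding 5 gives utility 4. Against any fixed bid of II, no deviation does better than the truth.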

In our case, if all others answer queries truthfully, the analysis carries through as
is, and so truth-telling maximizes i’s expected utility. The decomposition-based
mechanism thus has truthfulness-in-expectation as an ex-post Nash equilibrium for the 
query-access model. Putting it differently, even if a player was told beforehand the 
types of the other players, he would have no incentive to deviate from truth-telling. 


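The decomposition technique. We now decompose $x/c = \sum_{l \in \mathcal{I}} \lambda_l \cdot x^l$, for any $x$
feasible to CA-P. We first write the LP $P$ and its dual $D$. Let $E = \{(i, S) \mid x_{i,S} > 0\}$.
Recall that $E$ is of polynomial size.


\[
\begin{aligned}
(P) \qquad \min \quad & \sum_{l \in \mathcal{I}} \lambda_l \\
\text{s.t.} \quad & \sum_{l \in \mathcal{I}} \lambda_l\, x^l_{i,S} = \frac{1}{c}\, x_{i,S} && \forall (i,S) \in E \qquad (12.8) \\
& \sum_{l \in \mathcal{I}} \lambda_l \ge 1 \\
& \lambda_l \ge 0 && \forall l \in \mathcal{I}
\end{aligned}
\]

\[
\begin{aligned}
(D) \qquad \max \quad & \frac{1}{c} \sum_{(i,S) \in E} x_{i,S}\, w_{i,S} + z \\
\text{s.t.} \quad & \sum_{(i,S) \in E} x^l_{i,S}\, w_{i,S} + z \le 1 && \forall l \in \mathcal{I} \qquad (12.9) \\
& z \ge 0 \\
& w_{i,S} \text{ unconstrained} && \forall (i,S) \in E.
\end{aligned}
\]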


Constraints 12.8 of $P$ describe the decomposition; hence, if the optimum satisfies
$\sum_{l \in \mathcal{I}} \lambda_l = 1$, we are almost done. $P$ has exponentially many variables, so we need to
show how to solve it in polynomial time. The dual $D$ will help. It has variables $w_{i,S}$
for each constraint 12.8 of $P$, so it has polynomially many variables but exponentially
many constraints. We use the ellipsoid method to solve it, and construct a separation
oracle using our verifier $A$.


Claim 12.17 If $w, z$ is feasible for $D$ then $\frac{1}{c} \sum_{(i,S) \in E} x_{i,S} w_{i,S} + z \le 1$. Furthermore,
if this inequality is reversed, one can use $A$ to find a violated constraint
of $D$ in polynomial time.


PROOF Suppose $\frac{1}{c} \sum_{(i,S) \in E} x_{i,S} w_{i,S} + z > 1$. Let $A$ receive $w$ as input and suppose
that the integral allocation that $A$ outputs is $x^l$. We have
$\sum_{(i,S) \in E} x^l_{i,S} w_{i,S} \ge \frac{1}{c} \sum_{(i,S) \in E} x_{i,S} w_{i,S} > 1 - z$, where the first inequality follows since $A$ is a
$c$-approximation to the fractional optimum, and the second inequality is the
violated inequality of the claim. Thus constraint 12.9 is violated (for $x^l$).


Corollary 12.18 The optimum of $D$ is 1, and the decomposition $x/c = \sum_{l \in \mathcal{I}} \lambda_l \cdot x^l$
is polynomial-time computable.


PROOF $z = 1$, $w_{i,S} = 0$ $\forall (i, S) \in E$ is feasible; hence, the optimum is at least
1. By Claim 12.17 it is at most 1. To solve $P$, we first solve $D$ with the following
separation oracle: given $w, z$, if $\frac{1}{c} \sum_{(i,S) \in E} x_{i,S} w_{i,S} + z \le 1$, return the separating
hyperplane $\frac{1}{c} \sum_{(i,S) \in E} x_{i,S} w_{i,S} + z = 1$. Otherwise, find the violated constraint,
which implies the separating hyperplane. The ellipsoid method uses a polynomial
number of constraints; thus, there is an equivalent program with only those constraints.
Its dual is a program that is equivalent to $P$ but with a polynomial number
of variables. We solve that to get the decomposition.
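
The separation oracle of this proof is short enough to sketch. The code below implements one oracle call only; the surrounding ellipsoid iteration is omitted. The brute-force verifier stands in for $A$ and is usable only on tiny supports; it, the triangle instance, and all names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

def dual_objective(x, w, z, c):
    return sum(x[key] * w[key] for key in x) / c + z

def separation_oracle(x, w, z, c, verifier):
    """One oracle call for the dual D, following Claim 12.17.

    x        -- feasible fractional point as a dict (i, S) -> value on its support E
    w, z     -- candidate dual point (w is a dict over E)
    verifier -- callable w -> integral allocation (set of (i, S) pairs),
                assumed to verify a c-integrality-gap
    Returns ("objective_cut", None) or ("violated_constraint", allocation).
    """
    if dual_objective(x, w, z, c) <= 1:
        return "objective_cut", None
    alloc = verifier(w)
    lhs = sum(w[key] for key in alloc if key in w) + z
    assert lhs > 1, "verifier did not meet its c-approximation promise"
    return "violated_constraint", alloc

def brute_force_verifier(w):
    """Exact maximizer over integral allocations supported on E (tiny inputs only)."""
    keys = list(w)
    best, best_alloc = 0, set()
    for mask in product([0, 1], repeat=len(keys)):
        chosen = [k for k, bit in zip(keys, mask) if bit]
        players = [i for i, _ in chosen]
        items = [j for _, s in chosen for j in s]
        if len(set(players)) < len(players) or len(set(items)) < len(items):
            continue  # a player or an item is used twice
        total = sum(w[k] for k in chosen)
        if total > best:
            best, best_alloc = total, set(chosen)
    return best_alloc

# Triangle instance: x puts 1/2 on each of three pairwise-intersecting bundles.
E = [(0, frozenset("ab")), (1, frozenset("bc")), (2, frozenset("ac"))]
x = {key: Fraction(1, 2) for key in E}
c = Fraction(3, 2)
ones = {key: Fraction(1) for key in E}
zeros = {key: Fraction(0) for key in E}
kind_bad, alloc_bad = separation_oracle(x, ones, Fraction(1, 2), c, brute_force_verifier)
kind_ok, _ = separation_oracle(x, zeros, Fraction(1), c, brute_force_verifier)
```

For the point $w = 1$, $z = 1/2$ the dual objective is 3/2, so the oracle returns an integral allocation whose constraint 12.9 is violated. For $w = 0$, $z = 1$ the objective equals 1 and the oracle returns the objective cut.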


Verifying the integrality gap. We now construct the integrality gap verifier for CA-P.
Recall that it receives as input weights $w_{i,S}$, and outputs an integral allocation $x^l$ which
is a $c$-approximation to the social welfare w.r.t. $w_{i,S}$. Two requirements differentiate
it from a “regular” $c$-approximation for CAs: (i) it cannot assume any structure on
the weights $w_{i,S}$ (unlike CA, where we have non-negativity and monotonicity), and
(ii) the obtained welfare must be compared to the fractional optimum (usually we care
for the integral optimum). The first property is not a problem.


Claim 12.19 Given a c-approximation for general CAs, A’, where the approx- 
imation is with respect to the fractional optimum, one can obtain an algorithm A 
that verifies a c-integrality-gap for the linear program CA-P, with a polynomial 
time overhead on top of A. 


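PROOF Given $w = \{w_{i,S}\}_{(i,S) \in E}$, define $w^+$ by $w^+_{i,S} = \max(w_{i,S}, 0)$, and $\tilde{w}$
by $\tilde{w}_{i,S} = \max_{T \subseteq S,\, (i,T) \in E} w^+_{i,T}$ (where the maximum is 0 if no $T \subseteq S$ has
$(i, T) \in E$). $\tilde{w}$ is a valid valuation, and can be succinctly represented with size
$|E|$. Let $O^* = \max_{x \text{ is feasible for CA-P}} \sum_{(i,S) \in E} x_{i,S} w_{i,S}$. Feed $\tilde{w}$ to $A'$ to get $\tilde{x}$ such
that $\sum_{i,S} \tilde{x}_{i,S} \tilde{w}_{i,S} \ge O^*/c$ (since $\tilde{w}_{i,S} \ge w_{i,S}$ for every $(i, S)$).

Note that it is possible that $\sum_{(i,S) \in E} \tilde{x}_{i,S} w_{i,S} < \sum_{i,S} \tilde{x}_{i,S} \tilde{w}_{i,S}$, since (i) the left
hand sum only considers coordinates in $E$ and (ii) some $w_{i,S}$ coordinates might
be negative. To fix the first problem define $x^+$ as follows: for any $(i, S)$ such that
$\tilde{x}_{i,S} = 1$, set $x^+_{i,T} = 1$ for $T = \arg\max_{T' \subseteq S :\, (i,T') \in E} w^+_{i,T'}$ (set all other coordinates
of $x^+$ to 0). By construction, $\sum_{i,S} \tilde{x}_{i,S} \tilde{w}_{i,S} = \sum_{(i,S) \in E} x^+_{i,S} w^+_{i,S}$. To fix the second
problem, define $x^l$ as follows: set $x^l_{i,S} = x^+_{i,S}$ if $w_{i,S} \ge 0$ and 0 otherwise. Clearly,
$\sum_{(i,S) \in E} x^l_{i,S} w_{i,S} = \sum_{(i,S) \in E} x^+_{i,S} w^+_{i,S}$, and $x^l$ is feasible for CA-P.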
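
The two repairs in this proof are mechanical, and a short Python sketch may make them concrete. The sketch takes the output of $A'$ as a given input (here a hand-written allocation) instead of running an approximation algorithm. The weights, the allocation `x_tilde`, and all function names are assumptions made for this illustration.

```python
def positive_part(w):
    """w^+ : clip negative weights to zero."""
    return {key: max(val, 0) for key, val in w.items()}

def best_sub_bundle(w_plus, i, S):
    """Bundle T inside S with (i, T) in E that maximizes w^+, or None if none exists."""
    candidates = [T for (p, T) in w_plus if p == i and T <= S]
    return max(candidates, key=lambda T: w_plus[(i, T)]) if candidates else None

def w_tilde(w_plus, i, S):
    """The monotone, nonnegative valuation fed to A'."""
    T = best_sub_bundle(w_plus, i, S)
    return w_plus[(i, T)] if T is not None else 0

def back_to_support(w, x_tilde):
    """Apply both fixes to an integral allocation x_tilde (dict player -> bundle)."""
    w_plus = positive_part(w)
    result = set()
    for i, S in x_tilde.items():
        T = best_sub_bundle(w_plus, i, S)      # fix 1: move onto a coordinate of E
        if T is not None and w[(i, T)] >= 0:   # fix 2: drop negative coordinates
            result.add((i, T))
    return result

# Hypothetical weights on E; two of them are negative.
w = {(0, frozenset("a")): 3, (0, frozenset("ab")): -2,
     (1, frozenset("c")): -1, (1, frozenset("bc")): 4}
# Hypothetical output of A' on the valuation w_tilde.
x_tilde = {0: frozenset("ab"), 1: frozenset("c")}
x_final = back_to_support(w, x_tilde)
```

Each selected bundle is a subset of the bundle that `x_tilde` gave the same player, so the result stays feasible, and its weight under the original $w$ equals the value of `x_tilde` under $\tilde{w}$.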


The requirement to approximate the fractional optimum does affect generality. 
However, one can use the many algorithms that use the primal-dual method, or a 
derandomization of an LP randomized rounding. Simple combinatorial algorithms 
may also satisfy this property. In fact, the greedy algorithm from Chapter 11 for
single-minded players satisfies the requirement, and a natural variant verifies a
$\sqrt{2} \cdot \sqrt{m}$ integrality-gap for CA-P.


Definition 12.20 (Greedy (revisited)) Fix $\{w_{i,S}\}_{(i,S) \in E}$ as the input. Construct
$x$ as follows. Let $(i, S) = \arg\max_{(i',S') \in E} (w_{i',S'}/\sqrt{|S'|})$. Set $x_{i,S} = 1$. Remove
from $E$ all $(i', S')$ with $i' = i$ or $S' \cap S \neq \emptyset$. If $E \neq \emptyset$, reiterate.
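
A direct Python transcription of this definition follows. It is a sketch that assumes nonempty bundles and breaks ties by dictionary order, which the definition leaves open. The function name `greedy` and the sample weights are assumptions made for this illustration.

```python
from math import sqrt

def greedy(weights):
    """Greedy of Definition 12.20 on weights: dict (player, frozenset) -> weight."""
    remaining = dict(weights)
    chosen = []
    while remaining:
        # Pick the pair maximizing weight / sqrt(bundle size).
        i, S = max(remaining, key=lambda k: remaining[k] / sqrt(len(k[1])))
        chosen.append((i, S))
        # Drop every pair that reuses player i or intersects S.
        remaining = {(p, T): wt for (p, T), wt in remaining.items()
                     if p != i and not (T & S)}
    return chosen

# One player wants all four items, four others want one item each.
example = {(0, frozenset("abcd")): 3,
           (1, frozenset("a")): 1, (2, frozenset("b")): 1,
           (3, frozenset("c")): 1, (4, frozenset("d")): 1}
picked = greedy(example)
```

In this example greedy serves only player 0, because $3/\sqrt{4} = 1.5$ exceeds 1, and collects weight 3 while the optimum is 4. Lowering player 0's weight to 1.9 makes greedy pick the four singletons instead.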


Lemma 12.21 Greedy is a $\sqrt{2m}$-approximation to the fractional optimum.


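PROOF Let $y = \{y_{i,S}\}_{(i,S) \in E}$ be the optimal fractional allocation. For every
player $i$ with $x_{i,S_i} = 1$ (for some $S_i$), let $Y_i = \{ (i', S) \in E \mid y_{i',S} > 0$ and $(i', S)$
was removed from $E$ when $(i, S_i)$ was added $\}$. We show that
$\sum_{(i',S) \in Y_i} y_{i',S}\, w_{i',S} \le (\sqrt{2} \sqrt{m})\, w_{i,S_i}$, which proves the claim. We first have

\[
\begin{aligned}
\sum_{(i',S) \in Y_i} y_{i',S}\, w_{i',S} &= \sum_{(i',S) \in Y_i} y_{i',S}\, \frac{w_{i',S}}{\sqrt{|S|}}\, \sqrt{|S|} \\
&\le \frac{w_{i,S_i}}{\sqrt{|S_i|}} \sum_{(i',S) \in Y_i} y_{i',S}\, \sqrt{|S|} \\
&\le \frac{w_{i,S_i}}{\sqrt{|S_i|}}\, \sqrt{\sum_{(i',S) \in Y_i} y_{i',S}}\; \sqrt{\sum_{(i',S) \in Y_i} y_{i',S} \cdot |S|} \qquad (12.10)
\end{aligned}
\]

The first inequality follows since $(i, S_i)$ was chosen by greedy when $(i', S)$ was
in $E$, and the second inequality is a simple algebraic fact (the Cauchy-Schwarz inequality
applied to the vectors with entries $\sqrt{y_{i',S}}$ and $\sqrt{y_{i',S} \cdot |S|}$). We also have:

\[
\sum_{(i',S) \in Y_i} y_{i',S} \;\le\; \sum_{j \in S_i}\; \sum_{(i',S) \in Y_i,\, j \in S} y_{i',S} \;+ \sum_{(i,S) \in Y_i} y_{i,S} \;\le\; \sum_{j \in S_i} 1 + 1 \;\le\; |S_i| + 1 \qquad (12.11)
\]

where the first inequality holds since every $(i', S) \in Y_i$ has either $S \cap S_i \neq \emptyset$ or
$i' = i$, and the second inequality follows from the feasibility constraints of CA-P,
and,

\[
\sum_{(i',S) \in Y_i} y_{i',S} \cdot |S| \;\le\; \sum_{j \in \Omega}\; \sum_{(i',S) \in Y_i,\, j \in S} y_{i',S} \;\le\; m \qquad (12.12)
\]

Combining 12.10, 12.11, and 12.12, we get what we need:

\[
\sum_{(i',S) \in Y_i} y_{i',S}\, w_{i',S} \;\le\; \frac{w_{i,S_i}}{\sqrt{|S_i|}} \cdot \sqrt{|S_i| + 1} \cdot \sqrt{m} \;\le\; \sqrt{2} \cdot \sqrt{m} \cdot w_{i,S_i}
\]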


Greedy is not truthful, but with the decomposition-based mechanism, we use 
randomness in order to “plug-in” truthfulness. We get the following theorem. 


Theorem 12.22 The decomposition-based mechanism with Greedy as the
integrality-gap verifier is individually rational and truthful-in-expectation, and
obtains an approximation of $\sqrt{2} \cdot \sqrt{m}$ to the social welfare.


Remarks. The decomposition-based technique is quite general, and can be used in 
other cases, if an integrality-gap verifier exists for the LP formulation of the problem. 




Perhaps the most notable case is multiunit CAs, where there exist B copies of each 
item, and any player desires at most one copy from each item. In this case, one can 
verify an $O(m^{1/(B+1)})$ integrality gap, and this is the best possible in polynomial time. To
date, the decomposition-based mechanism is the only truthful mechanism with this 
tight guarantee. 

Nevertheless, this method is not completely general, as VCG is. One drawback is for 
special cases of CAs, where low approximation ratios exist, but the integrality gap of 
the LP remains the same. For example, with sub-modular valuations, the integrality gap 
of CA-P is the same (the constraints do not change), but lower-than-2 approximations 
exist. To date, no truthful mechanism with constant approximation guarantees is 
known for this case. One could, in principle, construct a different LP formulation for 
this case, with a smaller integrality gap, but these attempts have been unsuccessful so far.

While truthfulness-in-expectation is a natural modification of (deterministic) 
truthfulness, and although this notion indeed continues to be a worst-case notion, still 
it is inferior to truthfulness. Players are assumed to only care about their expected 
utility, and not about the variance, for example. A stronger notion is that of “universal 
truthfulness,” where players maximize their utility for every coin toss. But even this is
still weaker. While in classic algorithmic settings one can use the law of large numbers 
to approach the expected performance, in mechanism design one cannot repeat 
the execution and choose the best outcome as this affects the strategic properties. 
Deterministic mechanisms are still a better choice. 


12.3.1 A General Overview of Truthful Combinatorial Auctions 


The search for truthful CAs is an active field of research. Roughly speaking, two
techniques have proved useful for constructing truthful CAs. In “Maximal-in-Range”
mechanisms, the range of possible allocations is restricted, and the optimal-in-this-range
allocation is chosen. This achieves deterministic truthfulness with an $O(\sqrt{m})$-approximation
for subadditive valuations (Dobzinski et al., 2005), an $O(m/\sqrt{\log m})$-approximation
for general valuations (Holzman et al., 2004), and a 2-approximation
when all items are identical (“multi-unit auctions”) (Dobzinski and Nisan, 2006). A
second technique is to partition the set of players, sample statistics from one set, and use
it to obtain a good approximation for the other. See Chapter 13 for details. This technique
obtains an $O(\sqrt{m})$-approximation for general valuations, and an $O(\log^2 m)$-approximation for
XOS valuations (Dobzinski et al., 2006). The truthfulness here is “universal,” i.e., for
any coin toss, which is a stronger notion than truthfulness in expectation. Bartal et al. (2003)
use a similar idea to obtain a truthful and deterministic $O(B \cdot m^{1/(B-2)})$-approximation for
multiunit CAs with $B \ge 3$ copies of each item. For special cases of CAs, these techniques
do not yet manage to obtain constant-factor truthful approximations (Dobzinski
and Nisan, 2006 prove this impossibility for Maximal-In-Range mechanisms). Due to
the importance of constant-factor approximations, explaining this gap is challenging:


Open Question Do there exist truthful constant-factor approximations for special
cases of CAs that are NP-hard and yet for which constant algorithmic approximations are known?
For example, does there exist a truthful constant-factor approximation for CAs with
submodular valuations?


For general valuations, the above shows a significant gap in the power of randomized vs.
deterministic techniques. It is not known if this gap is essential. A possible argument for 
this gap is that, for general valuations, every deterministic mechanism is VCG-based, 
and these have no power. Lavi et al. (2003) have initiated an investigation for the first 
part of the argument, obtaining only partial results. Dobzinski and Nisan (2006) have 
studied the other part of the argument, again with only partial results. 


Open Question What are the limitations of deterministic truthful CAs? Do approximation
and dominant strategies clash in some fundamental and well-defined way
for CAs?


This section was devoted to welfare maximization. Revenue maximization is another 
important goal for CA design. The mechanism of Bartal et al. (2003) obtains the same 
guarantees with respect to the optimal revenue. Tighter results for multi-unit auctions
with budget constrained players are given by Borgs et al. (2005), and for unlimited- 
supply CAs by Balcan et al. (2005). It should be noted that these are preliminary 
results for special cases; this issue is still quite unexplored. 


12.4 Impossibilities of Dominant Strategy Implementability 


In the previous sections we saw an interesting contrast between deterministic and 
randomized truthfulness, where the key difference seems to be the dimensionality of 
the domain. We now ask whether the source of this difficulty can be rigorously identified 
and characterized. What exactly do we mean by an “impossibility,” especially since we 
know that VCG mechanisms are possible, in every domain? Well, we mean that nothing 
besides VCG is possible. Such a situation should be viewed as an impossibility, since 
(i) many times VCG is computationally intractable (as we saw for CAs), and (ii) many 
times we seek goals different from welfare maximization (as we saw for scheduling 
domains). The monotonicity characterizations of Chapter 9 almost readily provide a few
easy impossibilities for some special domains (see the exercises at the end of this 
chapter), and in this section we will study a more fundamental case. 

To formalize our exact question, it will be convenient to use the abstract social choice
setting introduced in Chapter 9: there is a finite set $A$ of alternatives, and each player
has a type (valuation function) $v_i : A \to \mathbb{R}$ that assigns a real number to every possible
alternative. $v_i(a)$ should be interpreted as $i$'s value for alternative $a$. The valuation
function $v_i(\cdot)$ belongs to the domain $V_i$ of all possible valuation functions. Our goal is
to implement in dominant strategies the social choice function $f : V_1 \times \cdots \times V_n \to A$
(where w.l.o.g. assume that $f : V \to A$ is onto $A$). From Chapter 9 we know that VCG
implements welfare maximization, for any domain, and that affine maximizers are also
always implementable.


Definition 12.23 (Affine maximizer) $f$ is an “affine maximizer” if there exist
weights $k_1, \ldots, k_n$ and $\{C_x\}_{x \in A}$ such that, for all $v \in V$,

\[
f(v) \in \operatorname{argmax}_{x \in A} \Big\{ \sum_{i=1}^{n} k_i v_i(x) + C_x \Big\}.
\]
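
For a finite set of alternatives the definition translates directly into code. The sketch below breaks ties by the iteration order of the alternatives, which is an arbitrary choice that the definition leaves open. The function name and the two-player example are assumptions made for this illustration.

```python
def affine_maximizer(valuations, weights, constants):
    """Return an alternative maximizing sum_i k_i * v_i(x) + C_x.

    valuations -- list of dicts, valuations[i][x] = v_i(x)
    weights    -- list of player weights k_i
    constants  -- dict of additive constants C_x, one entry per alternative
    """
    def score(x):
        return sum(k * v[x] for k, v in zip(weights, valuations)) + constants[x]
    return max(constants, key=score)

v = [{"a": 5, "b": 3, "c": 0}, {"a": 0, "b": 3, "c": 4}]
no_bias = {"a": 0, "b": 0, "c": 0}
welfare_choice = affine_maximizer(v, [1, 1], no_bias)   # plain welfare maximization
weighted_choice = affine_maximizer(v, [2, 1], no_bias)  # player 0 counts double
```

With unit weights and zero constants this is welfare maximization and selects `b` (total value 6). Doubling the first player's weight changes the choice to `a`.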


The fundamental question is what other function forms are implementable. This
question has remained mostly unexplored, with few exceptions. In particular, if the 
domain is unrestricted, the answer is sharp. 


Theorem 12.24 Suppose $|A| \ge 3$ and $V_i = \mathbb{R}^A$ for all $i$. Then $f$ is dominant-strategy
implementable iff it is an affine maximizer.


We will prove here a slightly easier version of the sufficiency direction. The proof 
is simplified by adding an extra requirement, but the essential structure is kept. The 
exercises give guidelines to complete the full proof. 


Definition 12.25 (Neutrality) $f$ is neutral if for all $v \in V$, if there exists an
alternative $x$ such that $v_i(x) > v_i(y)$, for all $i$ and $y \neq x$, then $f(v) = x$.


Neutrality essentially implies that if a function is indeed an affine maximizer then the
additive constants $C_x$ are all zero.


Theorem 12.26 Suppose $|A| \ge 3$ and for every $i$, $V_i = \mathbb{R}^A$. If $f$ is dominant-strategy
implementable and neutral then it must be an affine maximizer.


For the proof, we start with two monotonicity conditions. Recall that Chapter 9 
portrayed the strong connection between implementability and certain monotonicity 
properties. The monotonicity conditions that we consider here are stronger, and are not 
necessary for all domains. However, for an unrestricted domain, their importance will 
soon become clear. 


Definition 12.27 (Positive association of differences (PAD)) $f$ satisfies PAD
if the following holds for any $v, v' \in V$. Suppose $f(v) = x$, and for any $y \neq x$,
and any $i$, $v_i'(x) - v_i(x) > v_i'(y) - v_i(y)$. Then $f(v') = x$.
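
On a finite grid of valuations, PAD can be tested exhaustively. The sketch below does so for two players with 0/1 values over three alternatives. This small grid is an assumption made for the illustration and is far from the unrestricted domain of the theorems, so a passing check is only a sanity test. The welfare minimizer is included purely as a contrast case.

```python
from itertools import product

def welfare_maximizer(v, alternatives):
    """Affine maximizer with unit weights and zero constants (first maximum wins ties)."""
    return max(alternatives, key=lambda x: sum(vi[x] for vi in v))

def welfare_minimizer(v, alternatives):
    """A deliberately perverse rule, used only as a contrast case."""
    return min(alternatives, key=lambda x: sum(vi[x] for vi in v))

def pad_premise(v, v_new, x, alternatives):
    """True if every player's gain on x strictly exceeds her gain on every y != x."""
    return all(vn[x] - vo[x] > vn[y] - vo[y]
               for vo, vn in zip(v, v_new) for y in alternatives if y != x)

def check_pad(f, profiles, alternatives):
    """Exhaustively test PAD for f on a finite list of valuation profiles."""
    for v, v_new in product(profiles, repeat=2):
        x = f(v, alternatives)
        if pad_premise(v, v_new, x, alternatives) and f(v_new, alternatives) != x:
            return False
    return True

alternatives = ["a", "b", "c"]
grid = [dict(zip(alternatives, vals)) for vals in product(range(2), repeat=3)]
profiles = [(v1, v2) for v1 in grid for v2 in grid]   # two players
```

The welfare maximizer passes the check, as expected of an implementable function, while the welfare minimizer fails it.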


Claim 12.28 Any implementable function $f$, on any domain, satisfies PAD.


PROOF Let $v^i = (v_1', \ldots, v_i', v_{i+1}, \ldots, v_n)$, i.e., players up to $i$ declare according
to $v'$; the rest declare according to $v$. Thus $v^0 = v$, $v^n = v'$, and $f(v^0) = x$.
Suppose $f(v^{i-1}) = x$ for some $1 \le i \le n$. For every alternative $y \neq x$ we have
$v_i^i(y) - v_i^{i-1}(y) < v_i^i(x) - v_i^{i-1}(x)$, and in addition $v_{-i}^i = v_{-i}^{i-1}$. Thus, W-MON
implies that $f(v^i) = x$. By induction, $f(v^n) = x$.


In an unrestricted domain, weak monotonicity can be generalized as follows. 


Definition 12.29 (Generalized-WMON) For every $v, v' \in V$ with $f(v) = x$
and $f(v') = y$ there exists a player $i$ such that $v_i'(y) - v_i(y) \ge v_i'(x) - v_i(x)$.


With weak monotonicity, we fix a player and fix the declarations of the others. Here,
this qualifier is dropped. Another way of looking at this property is the following: If
$f(v) = x$ and $v'(x) - v(x) > v'(y) - v(y)$ then $f(v') \neq y$ (a word about notation: for
$\alpha, \beta \in \mathbb{R}^n$, we use $\alpha > \beta$ to denote that $\forall i$, $\alpha_i > \beta_i$).


Claim 12.30 If the domain is unrestricted and $f$ is implementable then $f$
satisfies Generalized-WMON.


PROOF Fix any $v, v'$. We show that if $f(v') = x$ and $v'(y) - v(y) > v'(x) - v(x)$
for some $y \in A$ then $f(v) \neq y$. By contradiction, suppose that $f(v) = y$.
Fix $\Delta \in \mathbb{R}^n_{++}$ such that $v'(x) - v'(y) = v(x) - v(y) - \Delta$, and define $v''$:

\[
\forall i,\ z \in A: \quad v_i''(z) =
\begin{cases}
\min\{v_i(z),\ v_i'(z) + v_i(x) - v_i'(x)\} - \Delta_i & z \neq x, y \\
v_i(x) - \Delta_i/2 & z = x \\
v_i(y) & z = y.
\end{cases}
\]

By PAD, the transition $v \to v''$ implies $f(v'') = y$, and the transition $v' \to v''$
implies $f(v'') = x$, a contradiction.


We now get to the main construction. For any $x, y \in A$, define:

\[
P(x, y) = \{ \alpha \in \mathbb{R}^n \mid \exists v \in V :\ v(x) - v(y) = \alpha,\ f(v) = x \}. \qquad (12.13)
\]

Looking at differences helps since we need to show that $\sum_i k_i [v_i(x) - v_i(y)] \ge C_y - C_x$
if $f(v) = x$. Note that $P(x, y)$ is not empty (by assumption there exists $v \in V$ with
$f(v) = x$), and that if $\alpha \in P(x, y)$ then for any $\delta \in \mathbb{R}^n_{++}$ (i.e., $\delta > 0$), $\alpha + \delta \in P(x, y)$:
take $v$ with $f(v) = x$ and $v(x) - v(y) = \alpha$, and construct $v'$ by increasing $v(x)$ by $\delta$,
and setting the other coordinates as in $v$. By PAD $f(v') = x$, and $v'(x) - v'(y) = \alpha + \delta$.


Claim 12.31 For any $\alpha, \epsilon \in \mathbb{R}^n$, $\epsilon > 0$: (i) $\alpha - \epsilon \in P(x, y) \Rightarrow -\alpha \notin P(y, x)$,
and (ii) $\alpha \notin P(x, y) \Rightarrow -\alpha \in P(y, x)$.


PROOF (i) Suppose by contradiction that $-\alpha \in P(y, x)$. Therefore there exists
$v \in V$ with $v(y) - v(x) = -\alpha$ and $f(v) = y$. As $\alpha - \epsilon \in P(x, y)$, there also
exists $v' \in V$ with $v'(x) - v'(y) = \alpha - \epsilon$ and $f(v') = x$. But since $v(x) - v(y) =
\alpha > v'(x) - v'(y)$, this contradicts Generalized-WMON. (ii) For any $z \neq x, y$
take some $\beta_z \in P(x, z)$ and fix some $\epsilon > 0$. Fix some $v$ such that $v(x) - v(y) = \alpha$
and $v(x) - v(z) = \beta_z + \epsilon$ for all $z \neq x, y$. By the above argument, $f(v) \in \{x, y\}$.
Since $v(x) - v(y) = \alpha \notin P(x, y)$ it follows that $f(v) = y$. Thus $-\alpha = v(y) - v(x) \in P(y, x)$,
as needed.


Claim 12.32 Fix $\alpha, \beta, \epsilon_1, \epsilon_2 \in \mathbb{R}^n$, $\epsilon_i > 0$, such that $\alpha - \epsilon_1 \in P(x, y)$ and
$\beta - \epsilon_2 \in P(y, z)$. Then $\alpha + \beta - (\epsilon_1 + \epsilon_2)/2 \in P(x, z)$.


PROOF For any $w \neq x, y, z$ fix some $\delta_w \in P(x, w)$. Choose any $v$ such that
$v(x) - v(y) = \alpha - \epsilon_1/2$, $v(y) - v(z) = \beta - \epsilon_2/2$, and $v(x) - v(w) = \delta_w + \epsilon$ for
all $w \neq x, y, z$ (for some $\epsilon > 0$). By Generalized-WMON, $f(v) = x$. Thus
$\alpha + \beta - (\epsilon_1 + \epsilon_2)/2 = v(x) - v(z) \in P(x, z)$.


Claim 12.33 If $\alpha$ is in the interior of $P(x, y)$ then $\alpha$ is in the interior of $P(x, z)$,
for any $z \neq x, y$.


PROOF Suppose $\alpha - \epsilon \in P(x, y)$ for some $\epsilon > 0$. By neutrality we have that
$\epsilon/4 - \epsilon/8 = \epsilon/8 \in P(y, z)$. By Claim 12.32 we now get that $\alpha - \epsilon/4 \in P(x, z)$,
which implies that $\alpha$ is in the interior of $P(x, z)$.


By similar arguments, we also have that if $\alpha$ is in the interior of $P(x, z)$ then $\alpha$
is in the interior of $P(w, z)$. Thus we get that for any $x, y, w, z \in A$, not necessarily
distinct, the interior of $P(x, y)$ is equal to the interior of $P(w, z)$. Denote the interior
of $P(x, y)$ as $P$.


Claim 12.34 $P$ is convex.


PROOF We show that $\alpha, \beta \in P$ implies $(\alpha + \beta)/2 \in P$. A known fact from
convexity theory then implies that $P$ is convex.² By Claim 12.32, $\alpha + \beta \in P$. We
show that for any $\alpha \in P$ we have $\alpha/2 \in P$ as well, which then implies the Claim.
Suppose by contradiction that $\alpha/2 \notin P$. Thus by Claim 12.31, $-\alpha/2 \in P$. Then
$\alpha/2 = \alpha + (-\alpha/2) \in P$, a contradiction.


We now conclude the proof of Theorem 12.26. Neutrality implies that 0 is on the
boundary of any $P(x, y)$; hence, it is not in $P$. Let $\bar{P}$ denote the closure of $P$. By the
separation lemma, there exists a $k \in \mathbb{R}^n$ such that for any $\alpha \in \bar{P}$, $k \cdot \alpha \ge 0$. Suppose
that $f(v) = x$ for some $v \in V$, and fix any $y \neq x$. Thus $v(x) - v(y) \in P(x, y)$, and
$k \cdot (v(x) - v(y)) \ge 0$. Hence $k \cdot v(x) \ge k \cdot v(y)$, and the theorem follows.

We have just seen a unique example, demonstrating that there exists a domain 
for which affine maximizers are the only possibility. However, our natural focus is on 
restricted domains, as most of the computational models that we consider do have some 
structure (e.g., the two domains we have considered in this chapter). Unfortunately, 
clear-cut impossibilities for such domains are not known. 


Open Question Characterize the class of domains for which affine maximizers are 
the only implementable functions. 


Even this question does not capture the entire picture, as, for example, it is known that
there exists a CA that is implementable but is not an affine maximizer.³ Nevertheless, there
do seem to be some inherent difficulties in designing truthful and computationally-efficient
CAs.⁴ The less formal open question therefore searches for the fundamental
issues that cause the clash. Obviously, these are related to the monotonicity conditions,
but an exact quantification of this is still unknown.


² For $\alpha, \beta \in P$ and $0 \le \lambda \le 1$, build a series of points that approach $\lambda\alpha + (1 - \lambda)\beta$, such that any point in the
series has a ball of some fixed radius around it that fully belongs to $P$.

³ See Lavi et al. (2003).

⁴ Note that we have in mind deterministic CAs.


12.5 Alternative Solution Concepts 


In light of the conclusions of the previous section, a natural way to advance would 
be to reexamine the solution concept that we are using. In Section 12.3 we saw that 
randomization certainly helps, but also carries with it some disadvantages. However, in 
some cases randomization is not known to help, and additionally sometimes we want to 
stick to deterministic mechanisms. What other solution concepts that fit the worst-case 
way of thinking in CS can we use? 

One simple thought is that algorithm designers do not care so much about actually 
reaching an equilibrium point — our major concern is to guarantee the optimality of the 
solution, taking into account the strategic behavior of the players. One way of doing 
this is to reach a good equilibrium point. But there is no reason why we should not 
allow the mechanism designer to “leave in” several acceptable strategic choices for the 
players, and to require the approximation to be achieved in each of these choices. 

As a first attempt, one is tempted to simply let the players try to improve the
basic result by allowing them to lie. However, this can cause unexpected dynamics, as
each player chooses her lies under some assumptions about the lies of the others, and
so on. We wish to avoid such an unpredictable situation, and we insist on using rigorous
game-theoretic reasoning to explain exactly why the outcome will be satisfactory. The
following definition captures the initial intuition, without falling into such pitfalls:


Definition 12.35 (Algorithmic implementation) A mechanism M is an
algorithmic implementation of a c-approximation (in undominated strategies) if there
exists a set of strategies, D, such that (i) M obtains a c-approximation for any
combination of strategies from D, in polynomial time, and (ii) for any strategy
not in D, there exists a strategy in D that weakly dominates it, and this transition
is polynomial-time computable.
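
Condition (ii) of the definition can be checked by brute force on a small finite game. The following is a minimal sketch, not part of the original text: the function names, the toy second-price utility, and the reading of "weakly dominates" as "at least the same utility against every opponent profile" are all assumptions made for illustration, and the polynomial-time requirements of the definition are not modeled.

```python
from typing import Callable, Hashable, Iterable, Sequence

Strategy = Hashable


def weakly_dominates(
    utility: Callable[[Strategy, Strategy], float],
    s: Strategy,
    t: Strategy,
    opponent_profiles: Sequence[Strategy],
) -> bool:
    """True if s yields at least the utility of t against every opponent profile."""
    return all(utility(s, o) >= utility(t, o) for o in opponent_profiles)


def every_outside_strategy_is_dominated(
    utility: Callable[[Strategy, Strategy], float],
    strategies: Iterable[Strategy],
    D: set,
    opponent_profiles: Sequence[Strategy],
) -> bool:
    """Condition (ii): each strategy outside D is weakly dominated by some strategy in D."""
    return all(
        any(weakly_dominates(utility, d, t, opponent_profiles) for d in D)
        for t in strategies
        if t not in D
    )


# Hypothetical toy game: one bidder with value 2 facing a single competing bid,
# second-price rule, ties lose.
VALUE = 2
BIDS = [0, 1, 2, 3]


def second_price_utility(bid: int, other_bid: int) -> int:
    return VALUE - other_bid if bid > other_bid else 0


print(every_outside_strategy_is_dominated(second_price_utility, BIDS, {2}, BIDS))
print(every_outside_strategy_is_dominated(second_price_utility, BIDS, {0}, BIDS))
```

In this toy game the singleton set containing the truthful bid passes the check, while the set containing only the zero bid does not, since bidding 1 beats bidding 0 when the competing bid is 0.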


The important ingredients of a dominant-strategies implementation are here: the 
only assumption is that a player is willing to replace any chosen strategy with a 
strategy that dominates it. Indeed, this guarantees at least the same utility, even in 
the worst case, and by definition can be done in polynomial time. In addition, again 
as in dominant-strategy implementability, this notion does not require any form of 
coordination among the players (unlike Nash equilibrium), or that players have any 
assumptions on the rationality of the others (as in “iterative deletion of dominated 
strategies”). 

However, two differences from dominant-strategies implementation are worth
mentioning: (i) A player might regret his chosen strategy, realizing in retrospect that
another strategy from D would have performed better, and (ii) deciding how to play
is not straightforward. While a player will not end up playing a strategy that does not
belong to D, it is not clear how he will choose one of the strategies of D. This may
depend, for example, on the player’s own beliefs about the other players, or on the
computational power of the player.

Another remark, about the connection to the notion of implementation in undominated
strategies, is in order. The definition of D does not imply that all undominated
strategies belong to D, but rather that for every undominated strategy, there is an
equivalent strategy inside D (i.e., a strategy that yields the same utility, no matter
what the others play). The same issue occurs with dominant-strategy implementations,
e.g., VCG, where it is not required that truthfulness should be the only dominant
strategy, just a dominant strategy.

In this section we illustrate how to use such a solution concept to design CAs for 
a special class of “single-value” players. The resulting auction has another interesting 
feature: while most mechanisms we have seen so far are direct revelation, in practice 
indirect mechanisms, and especially ascending auctions (players compete by raising 
prices and winners pay their last bid) are much preferred. The following result is an 
attempt to handle this issue as well. 


Single-value players. The mechanisms of this section fit the special case of players
that desire several different bundles, all for the same value: Player $i$ is single-valued
if there exists $\bar v_i \ge 1$ such that for any bundle $s$, $v_i(s) \in \{0, \bar v_i\}$. That is, $i$ desires any
one bundle out of a collection $\bar S_i$ of bundles, for a value $\bar v_i$. We denote such a player
by $(\bar v_i, \bar S_i)$. $\bar v_i$ and $\bar S_i$ are private information of the player. Since $\bar S_i$ may be of size
exponential in $m$, we assume the query access model, as detailed below.


An iterative wrapper. We start with a wrapper to a given algorithmic subprocedure,
which will eventually convert algorithms to a mechanism, with a small approximation
loss. It operates in iterations, with iteration index $j$, and maintains the tentative winners
$W_j$, the sure-losers $L_j$, and a “tentative winning bundle” $s_i^j$ for every $i$. In each iteration,
the subprocedure is invoked to update the set of winners to $W_{j+1}$ and the winning
bundles to $s^{j+1}$. Every active nonwinner then chooses to double his bid ($v_i^j$) or to
permanently retire. This is iterated until all nonwinners retire.


Definition 12.36 (The wrapper) Initialize $j = 0$, $W_j = L_j = \emptyset$, and for every
player $i$, $v_i^0 = 1$ and $s_i^0 = \Omega$. While $W_j \cup L_j \ne$ “all players” perform:


1. $(W_{j+1}, s^{j+1}) \leftarrow PROC(v^j, s^j, W_j)$.

2. $\forall i \notin W_{j+1} \cup L_j$, $i$ chooses whether to double his value ($v_i^{j+1} \leftarrow 2 \cdot v_i^j$) or to
permanently retire ($v_i^{j+1} \leftarrow 0$). For all others set $v_i^{j+1} \leftarrow v_i^j$.

3. Update $L_{j+1} = \{ i \in N \mid v_i^{j+1} = 0 \}$ and $j \leftarrow j + 1$, and reiterate.


Outcome: Let $J = j$ (total number of iterations). Every $i \in W_J$ gets $s_i^J$ and pays
$v_i^J$. All others lose (get nothing, pay 0).


For feasibility, PROC must maintain: $\forall i, i' \in W_{j+1}$, $s_i^{j+1} \cap s_{i'}^{j+1} = \emptyset$.
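
The loop of the wrapper can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions, not the authors' implementation: `wrapper`, `greedy_proc`, and the callback `wants_to_double` are hypothetical names, the initial tentative bundles are passed in by the caller (the full item set in the general definition, or the known desired bundle for single-minded players), and `greedy_proc` is a simple stand-in for PROC (greedy by current bid, keeping the previous winners when they are worth more) with no claimed approximation ratio.

```python
from typing import Callable, FrozenSet, List, Set, Tuple

Bundle = FrozenSet[str]


def greedy_proc(v: List[int], s: List[Bundle], w_old: Set[int]) -> Tuple[Set[int], List[Bundle]]:
    """Hypothetical PROC: pick disjoint bundles greedily by current bid."""

    def extend(base: Set[int]) -> Set[int]:
        chosen = set(base)
        used = set().union(*(s[i] for i in chosen))
        # Stable sort: ties are broken by player index.
        for i in sorted(range(len(v)), key=lambda k: -v[k]):
            if v[i] > 0 and i not in chosen and not (s[i] & used):
                chosen.add(i)
                used |= s[i]
        return chosen

    candidate = extend(set())
    # Keep the old winners (extended to a maximal set) if the greedy set is worth less.
    if sum(v[i] for i in candidate) < sum(v[i] for i in w_old):
        candidate = extend(w_old)
    return candidate, s


def wrapper(
    initial_bundles: List[Bundle],
    proc: Callable[[List[int], List[Bundle], Set[int]], Tuple[Set[int], List[Bundle]]],
    wants_to_double: Callable[[int, int], bool],
) -> Tuple[Set[int], List[Bundle], List[int], int]:
    n = len(initial_bundles)
    v = [1] * n                      # current bids, all start at 1
    s = list(initial_bundles)        # tentative winning bundles
    winners: Set[int] = set()
    losers: Set[int] = set()
    iterations = 0
    while winners | losers != set(range(n)):
        winners, s = proc(v, s, winners)                      # step 1
        for i in range(n):                                    # step 2
            if i not in winners and i not in losers:
                v[i] = 2 * v[i] if wants_to_double(i, v[i]) else 0
        losers = {i for i in range(n) if v[i] == 0}           # step 3
        iterations += 1
    return winners, s, v, iterations


# Two single-minded players competing for item "b"; true values are assumptions.
bundles = [frozenset("ab"), frozenset("bc")]
true_values = [8, 3]
W, s_final, bids, J = wrapper(bundles, greedy_proc, lambda i, bid: 2 * bid <= true_values[i])
print(W, bids, J)
```

In this run each player keeps doubling only while the doubled bid stays at or below his true value; the winner pays his last bid.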

We need to analyze the strategic choices of the players, and the approximation loss 
(relative to PROC). This will be done gradually. We first worry about minimizing the 
number of iterations. 


Definition 12.37 (Proper procedure) PROC is proper if (1) Pareto:
$\forall i \notin W_{j+1} \cup L_j$, $s_i^{j+1} \cap (\cup_{l \in W_{j+1}} s_l^{j+1}) \ne \emptyset$, and (2) Shrinking-sets: $\forall i$, $s_i^{j+1} \subseteq s_i^j$.

In words, the pareto property implies that the set of winners that PROC outputs is
maximal, i.e., that any loser that has not retired desires a bundle that intersects some
winner’s bundle. The shrinking-sets property says that a player’s new tentative bundle
must be a subset of the old tentative bundle.

A “reasonable” player will not increase $v_i^j$ above $\bar v_i$; otherwise, his utility will be
nonpositive (this strategic issue is formally discussed below). Assuming this, there
will clearly be at most $n \cdot \log(v_{max})$ iterations, where $v_{max} = \max_i \bar v_i$. With a proper
procedure this bound becomes independent of $n$.


Lemma 12.38 If every player $i$ never increases $v_i^j$ above $\bar v_i$, then any proper
procedure performs at most $2 \cdot \log(v_{max}) + 1$ iterations.


PROOF Consider iteration $j = 2 \cdot \log(v_{max}) + 1$, and some $i_1 \notin W_{j+1} \cup L_j$ that
(by contradiction) doubles his value. By Pareto, there exists $i_2 \in W_{j+1}$ such
that $s_{i_1}^{j+1} \cap s_{i_2}^{j+1} \ne \emptyset$. By “shrinking-sets,” in every $j' < j$ their winning bundles
intersect, hence at least one of them was not a winner, and doubled his value. But
then the value of at least one of them exceeds $v_{max}$, a contradiction.


This affects the approximation guarantee, as shown below, and also implies that the 
Wrapper adds only a polynomial-time overhead to PROC. 


A warm-up analysis. To warm up and to collect basic insights, we first consider
the case of known single-minded players (KSM), where a player desires one specific
bundle, $\bar s_i$, which is public information (she can lie only about her value). This allows
for a simple analysis: the wrapper converts any given c-approximation to a dominant-
strategy mechanism with $O(\log(v_{max}) \cdot c)$ approximation. Thus, we get a deterministic
technique to convert algorithms to mechanisms, with a small approximation loss.

Here, we initialize $s_i^0 = \bar s_i$, and set $s_i^{j+1} = s_i^j$, which trivially satisfies the shrinking-
sets property. In addition, pareto is satisfied w.l.o.g.: if it is not, add winning players in
an arbitrary order until pareto holds. For KSM players, this takes $O(n \cdot m)$ time. Third,
we need one more property:


Definition 12.39 (Improvement) $\sum_{i \in W_{j+1}} v_i^j \ge \sum_{i \in W_j} v_i^j$.


This is again without loss of generality: if the winners outputted by PROC violate this, 
simply output W; as the new winners. To summarize, we use: 


Definition 12.40 (The KSM-PROC) Given a c-approximation A for KSM
players, KSM-PROC invokes A with $s^j$ (the desired bundles) and $v^j$ (player
values). Then, it postprocesses the output to verify pareto and improvement.


Proposition 12.41 Under dominant strategies, $i$ retires iff $\bar v_i/2 \le v_i^j \le \bar v_i$.


(The simple proof is omitted.) For the approximation, the following analysis carries
through to the single-value case. Let $\bar S_i|_{s_i^j} = \{ s \in \bar S_i \mid s \subseteq s_i^j \}$, and


$R_j(v, \bar S) = \{ (v_i, \bar S_i|_{s_i^j}) \mid i \text{ retired at iteration } j \}$, (12.14)

i.e., for every player $i$ that retired at iteration $j$ the set $R_j(v, \bar S)$ contains a single-value
player, with value $v_i$ (given as a parameter), and desired bundles $\bar S_i|_{s_i^j}$ (where $\bar S_i$ is given
as a parameter). For the KSM case, $R_j(v, \bar S)$ is exactly all retired players in iteration $j$, as
the operator “$|_{s_i^j}$” has no effect. Hence, to prove the approximation, we need to bound the
value of the optimal allocation to the players in $R = \cup_j R_j(\bar v, \bar S)$. For an instance $X$ of
single-value players, let $OPT(X)$ be the value of the optimal allocation to the players
in $X$. In particular: $OPT(R_j(v, \bar S)) = \max_{\text{all allocations } (s_1, \ldots, s_n) \text{ s.t. } s_i \in \bar S_i|_{s_i^j}} \{ \sum_{i : s_i \ne \emptyset} v_i \}$.
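
To make the quantity OPT(X) and the restriction operator concrete, here is a brute-force sketch for tiny instances. It is an illustration only: `restrict` and `opt_value` are hypothetical names, each player is modeled as a pair of a value and a list of nonempty desired bundles, and the search is exponential, so it is not meant as an efficient algorithm.

```python
from typing import FrozenSet, List, Sequence, Tuple

Bundle = FrozenSet[str]
Player = Tuple[int, List[Bundle]]  # (value, desired bundles)


def restrict(bundles: Sequence[Bundle], tentative: Bundle) -> List[Bundle]:
    """Keep only the desired bundles contained in the tentative winning bundle."""
    return [b for b in bundles if b <= tentative]


def opt_value(players: Sequence[Player], used: Bundle = frozenset()) -> int:
    """Value of the best allocation of pairwise disjoint desired bundles."""
    if not players:
        return 0
    (value, bundles), rest = players[0], players[1:]
    best = opt_value(rest, used)  # first player gets nothing
    for b in bundles:
        if not (b & used):        # bundle must not overlap items already given out
            best = max(best, value + opt_value(rest, used | b))
    return best


players = [
    (5, [frozenset("ab")]),
    (3, [frozenset("bc"), frozenset("c")]),
    (4, [frozenset("a")]),
]
print(opt_value(players))
print(restrict(players[1][1], frozenset("c")))
```

In this instance the best allocation serves the first player with {a, b} and the second with {c}.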


Definition 12.42 (Local approximation) A proper procedure is a c-local-
approximation w.r.t. a strategy set D if it satisfies improvement, and, for any
combination of strategies in D and any iteration $j$,


Algorithmic approximation: $OPT(R_j(v^j, \bar S)) \le c \cdot \sum_{i \in W_j} v_i^j$.


Value bounds: $v_i^j \le v_i(s_i^j)$, and, if $i$ retires at $j$ then $v_i^j \ge \bar v_i/2$.


Claim 12.43 Given a c-approximation A for single-minded players, KSM-PROC
is a c-local-approximation for the set D of dominant strategies.


PROOF The algorithmic approximation property follows since A outputs
a c-approximation outcome. The value bounds property is exactly
Proposition 12.41.


We next translate local approximation to global approximation (this is valid also for 
the single-value case). 


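Claim 12.44 A c-local-approximation satisfies $OPT(R) \le 5 \cdot \log(v_{max}) \cdot c \cdot
\sum_{i \in W_J} \bar v_i$ whenever players play strategies in D.


PROOF By the value bounds, $OPT(R_j(\bar v, \bar S)) \le 2 \cdot OPT(R_j(v^j, \bar S))$. We have
(i) $OPT(R_j(v^j, \bar S)) \le c \cdot \sum_{i \in W_j} v_i^j$ by algorithmic approximation, (ii) $\sum_{i \in W_j} v_i^j \le
\sum_{i \in W_{j+1}} v_i^{j+1}$ by improvement, and (iii) $v_i^J \le \bar v_i$ (by the value bounds), and
therefore we get $OPT(R_j(\bar v, \bar S)) \le 2 \cdot c \cdot \sum_{i \in W_J} \bar v_i$. Hence $OPT(R) \le \sum_{j=1}^{J}
OPT(R_j(\bar v, \bar S)) \le J \cdot 2 \cdot c \cdot \sum_{i \in W_J} \bar v_i$. Since $J \le 2 \cdot \log(v_{max}) + 1$, the claim
follows.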
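
The last step of the proof can be spelled out as a short calculation. The following is a sketch under the assumption $\log(v_{max}) \ge 2$, which is not stated in the text but is what is needed for the additive term to be absorbed into the constant 5:

```latex
\[
  J \cdot 2 \cdot c
  \;\le\; \bigl(2 \log(v_{max}) + 1\bigr) \cdot 2 \cdot c
  \;=\; 4 \cdot c \cdot \log(v_{max}) + 2 \cdot c
  \;\le\; 5 \cdot c \cdot \log(v_{max})
  \qquad \text{whenever } \log(v_{max}) \ge 2 .
\]
```

For smaller values of $v_{max}$ the same argument still gives a bound of the form $O(\log(v_{max}) \cdot c)$ plus an additive $O(c)$ term.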


For single-minded players, R is the set of losing players, hence we conclude: 


Theorem 12.45 Given any c-approximation for KSM players, the Wrapper
with KSM-PROC implements an $O(\log(v_{max}) \cdot c)$ approximation in dominant
strategies.


A subprocedure for single-value players. Two assumptions are relaxed: players
are now multiminded, and their desired bundles are unknown. Here, we define the
following specific subprocedure. For a set of players $X$, let $Free(X, s^{j+1})$ denote the
items not in $\cup_{i \in X} s_i^{j+1}$.


Definition 12.46 (1-CA-PROC) Let $M_j = \arg\max_{i \in N} \{ v_i^j \}$, $GREEDY_j = \emptyset$.
For every player $i$ with $v_i^j > 0$, in descending order of values, perform:


Shrinking the winning set: If $i \notin W_j$ allow him to pick a bundle
$s_i^{j+1} \subseteq Free(GREEDY_j, s^{j+1}) \cap s_i^j$ such that $|s_i^{j+1}| \le \sqrt{m}$. In any other case ($i \in W_j$
or $i$ does not pick) set $s_i^{j+1} = s_i^j$.

Updating the current winners: If $|s_i^{j+1}| \le \sqrt{m}$, add $i$ to any of the allocations
$W \in \{W_j, M_j, GREEDY_j\}$ for which $s_i^{j+1} \subseteq Free(W, s^{j+1})$.


Output $s^{j+1}$ and $W \in \{W_j, M_j, GREEDY_j\}$ that maximizes $\sum_{i \in W} v_i^j$.


Recall that the nonwinners then either double their value or retire, and we reiterate. 
This is the main conceptual difference from “regular” direct revelation mechanisms: 
here, the players themselves gradually determine their winning set (focusing on one 
of their desired bundles), and their price. Intuitively, it is not clear how a “reasonable” 
player should shrink his winning set, when approached. Ideally, a player should focus 
on a desired bundle that intersects few, low-value competitors. But in early iterations 
this information is not available. Thus there is no clear-cut way to shrink the winning
set, and the resulting mechanism does not contain a dominant strategy. This is exactly
the point where we use the new notion of algorithmic implementation.


Analysis. We proceed by characterizing the required set D of strategies. We say
that player $i$ is “loser-if-silent” at iteration $j$ if, when asked to shrink her bundle by
1-CA-PROC, $v_i^j \ge \bar v_i/2$ (retires if losing), $i \notin W_j$ and $i \notin M_j$ (not a winner), and
$s_i^j \cap (\cup_{i' \in W_j} s_{i'}^{j+1}) \ne \emptyset$ and $s_i^j \cap (\cup_{i' \in M_j} s_{i'}^{j+1}) \ne \emptyset$ (remains a loser after pareto). In
other words, a loser-if-silent loses (regardless of the others’ actions) unless she shrinks
her winning set. Let D be all strategies that satisfy, in every iteration $j$:


(i) $v_i^j \le v_i(s_i^j)$, and, if $i$ retires at $j$ then $v_i^j \ge \bar v_i/2$.
(ii) If $i$ is “loser-if-silent” then she declares a valid desired bundle $s_i^{j+1}$, if such a bundle
exists.


There clearly exists a (poly-time) algorithm to find a strategy st’ € D that dominates a 
given strategy st. Hence, D satisfies the second requirement of algorithmic implemen- 
tation. It remains to show that the approximation is achieved for every combination of 
strategies from D. 


Lemma 12.47 1-CA-PROC is an $O(\sqrt{m})$-local-approximation w.r.t. D.


PROOF (sketch). The pareto, improvement, and value-bounds properties are
immediate from the definition of the procedure and the set D. The $O(\sqrt{m})$-
algorithmic-approximation property follows from the following argument. We
need to bound $OPT = OPT(\{(v_i^j, \bar S_i|_{s_i^j}) \mid i \text{ retired at iteration } j\})$ by the sum of
values of the players in $W_j$. We divide the winners in OPT into four sets: those
that are in $M_j$, $GREEDY_j$, $W_j$, or in none of the above. For the first three sets
the 1-CA-PROC explicitly verifies our need. It remains to handle players in the
fourth set. First notice that such a player is loser-if-silent. If such a player receives
in OPT a bundle with size at least $\sqrt{m}$ we match him to the player with the highest
value in $M_j$. There can be at most $\sqrt{m}$ players in OPT with bundles of size at
least $\sqrt{m}$, so we lose a $\sqrt{m}$ factor for these players. If a player, $i$, in the fourth set,
receives in OPT a bundle with size at most $\sqrt{m}$, let $s_i^*$ be that bundle. Since he is
a loser-if-silent, there exists $i' \in GREEDY_j$ such that $s_{i'}^{j+1} \cap s_i^* \ne \emptyset$ and $v_i^j \le v_{i'}^j$.
We map $i$ to $i'$. For any $i_1, i_2$ that were mapped to $i'$ we have that $s_{i_1}^* \cap s_{i_2}^* = \emptyset$
since both belong to OPT. Since the size of $s_{i'}^{j+1}$ is at most $\sqrt{m}$ it follows that at
most $\sqrt{m}$ players can be mapped to $i'$, so we lose a $\sqrt{m}$ factor for these players
as well. This completes the argument.


In the single-value case, R does not contain all players, so we cannot repeat the 
argument from the KSM case that immediately linked local approximation and global 
approximation. However, Claim 12.44 still holds, and we use R as an intermediate set 
of “virtual” players. The link to the true players is as follows (recall that m denotes the 
number of items). 


Definition 12.48 (First-time shrink) PROC satisfies “first time shrink” if for
any $i_1, i_2 \in \{ i : |s_i^j| = m \ \&\ |s_i^{j+1}| < m \}$, $s_{i_1}^{j+1} \cap s_{i_2}^{j+1} = \emptyset$.


1-CA-PROC satisfies this since any player that shrinks his winning bundle is added to
$GREEDY_j$.


Lemma 12.49 Given a c-local-approximation (w.r.t. D) that satisfies first-time
shrink, the Wrapper obtains an $O(\log^2(v_{max}) \cdot c)$ approximation for any profile of
strategies in D.


PROOF We continue to use the notation of Claim 12.44. Let
$P = \{(\bar v_i, \bar S_i) : i \text{ lost, and } |s_i^J| < m\}$. Players in P appear with all their desired bundles, while
players in R appear with only part of their desired bundles. However, ignoring
the extra bundles in P incurs only a bounded loss:


Claim 12.50 $OPT(P) \le J \cdot OPT(R)$.


PROOF Define $P_j$ to be all players in P that first shrank their bundle at iteration
$j$. By “first-time shrink,” and since winning bundles only shrink, $s_{i_1}^J \cap s_{i_2}^J = \emptyset$
for every $i_1, i_2 \in P_j$. Therefore $OPT(R) \ge \sum_{i \in P_j} \bar v_i$: every player $i$ in $P_j$
corresponds to a player in R, and all these players have disjoint bundles in R since
the bundles of $i$ are contained in $s_i^J$. We also trivially have $OPT(P_j) \le \sum_{i \in P_j} \bar v_i$.
Thus, for any $j$, $OPT(P_j) \le OPT(R)$, and $OPT(P) \le \sum_j OPT(P_j) \le J \cdot OPT(R)$.


To prove the lemma, first notice that all true players are contained in $P \cup R \cup W_J$:
all retiring players belong to $R \cup P$ (if a player shrank his bundle then
he belongs to P with all his true bundles, and if a player did not shrink his
bundle at all then he belongs to R with all his true bundles) and all nonretiring
players belong to $W_J$. From the above we have
$OPT(P \cup R) \le OPT(P) + OPT(R) \le J \cdot OPT(R) + OPT(R) \le 4 \cdot J^2 \cdot c \cdot \sum_{i \in W_J} \bar v_i$.
Since $s_i^J$ contains some desired bundle of player $i$, we have that
$OPT(W_J) = \sum_{i \in W_J} \bar v_i$. Thus we
get that $OPT(P \cup R \cup W_J) \le 5 \cdot J^2 \cdot c \cdot \sum_{i \in W_J} \bar v_i$. Since $J \le 2 \cdot \log(v_{max}) + 1$
by Lemma 12.38, the lemma follows.


By all the above, we conclude the following. 


Theorem 12.51 The Wrapper with 1-CA-PROC is an algorithmic implementation
of an $O(\log^2(v_{max}) \cdot c)$-approximation for single-value players, where $c = O(\sqrt{m})$ by Lemma 12.47.


This result has demonstrated that if we are less interested in reaching an equilibrium 
point, but rather in guaranteeing a good-enough outcome, then alternative solution 
concepts, that are no worse than classic dominant strategies, can be of much help. 
However, the true power of relaxing dominant strategies to undominated strategies has
not yet been formally settled.

Open Question Does there exist a domain in which a computationally efficient 
algorithmic implementation achieves a better approximation than any computationally 
efficient dominant-strategy implementation? 


12.6 Bibliographic Notes 


The connection between classic scheduling and mechanism design was suggested by
Nisan and Ronen (2001), who studied unrelated machines and reached mainly
impossibilities. Archer and Tardos (2001) studied the case of related machines, and the
monotonicity characterization of Section 12.2 is based on their work. Deterministic 
mechanisms for the problem have been suggested by several works, and the algorithm 
presented here is by Andelman, Azar, and Sorani (2005). The current best approxi- 
mation ratio, 3, is given by Kovacs (2005). Section 12.3 is based on the work of Lavi 
and Swamy (2005). Roberts (1979) characterized dominant strategy implementability 
for unrestricted domains. The proof given here is based on Lavi, Mu’alem, and Nisan 
(2004). Generalized-WMON was suggested by Lavi, Mu’alem, and Nisan (2003), 
which explored the same characterization question for restricted domains in general, 
and for CAs in particular. Section 12.5 is based on the work of Babaioff, Lavi, and 
Pavlov (2006). There have been several other suggestions for alternative solution con- 
cepts. For example, Kothari et al. (2005) describe an “almost truthful” deterministic 
FPAS for multiunit auctions, and Lavi and Nisan (2005) define a notion of “Set-Nash” 
for multi-unit auctions in an online setting, for which they show that deterministic truth- 
fulness obtains significantly lower approximations than Set-Nash implementations. 

Exercises


12.1 (Scheduling related machines) Find an implementable algorithm that exactly ob- 
tains the optimal makespan, for scheduling on related machines (since this is an 
NP-hard problem, obviously you may ignore the computational complexity of your 
algorithm). 


12.2 (Scheduling unrelated machines) In the model of unrelated machines, each job $j$
creates a load $p_{ij}$ on each machine $i$, where the loads are completely unrelated.
Prove, using W-MON, that no truthful mechanism can approximate the makespan
with a factor better than 2. Hint: Start with four jobs that have $p_{ij} = 1$ for all $i, j$.

12.3. A deterministic greedy rounding of the fractional scheduling 12.4 assigns each 
job in full to the first machine that got a fraction of it. Explain why this is a 2- 
approximation, and show by an example that this violates monotonicity. 


12.4 Prove that 1-CA-PROC of Definition 12.46, and Greedy for multiminded players
of Definition 12.20 are not dominant-strategy implementable.


12.5 (Converting algorithms to mechanisms) Fix an alternative set $A$, and suppose that
for any player $i$, there is a fixed, known subset $A_i \subseteq A$, such that a valid valuation
assigns some positive real number in $[v_{min}, v_{max}]$ to every alternative in $A_i$,
and zero to the other alternatives. Suppose $v_{min}$ and $v_{max}$ are known. Given a
c-approximation algorithm to the social welfare for this domain, construct a
randomized truthful mechanism that obtains an $O(\log(v_{max}/v_{min}) \cdot c)$ approximation to
the social welfare. (Hint: choose a threshold price, uniformly at random). Is this
construction still valid when the sets $A_i$ are unknown? (If not, show a counter
example).


12.6 Describe a domain for which there exists an implementable social choice function
that does not satisfy Generalized-WMON.


12.7 Describe a deterministic CA for general valuations that is not an affine maximizer.


12.8 This exercise aims to complete the characterization of Section 12.4:

Let $\gamma(x, y) = \inf\{ p \in \mathbb{R} \mid p \cdot \vec{1} \in P(x, y) \}$. Show that $\gamma(x, y)$ is well-defined, that
$\gamma(x, y) = -\gamma(y, x)$, and that $\gamma(x, z) = \gamma(x, y) + \gamma(y, z)$. Let
$C(x, y) = \{ \alpha - \gamma(x, y) \cdot \vec{1} \mid \alpha \in P(x, y) \}$. Show that for any $x, y, w, z \in A$, the interior of $C(x, y)$ is equal to
the interior of $C(w, z)$. Use this to show that $C(x, y)$ is convex.

Conclude, by the separation lemma, that $f$ is an affine maximizer (give an explicit
formula for the additive terms $C_x$).


CHAPTER 13 


Profit Maximization 
in Mechanism Design 


Jason D. Hartline and Anna R. Karlin 


Abstract 


We give an introduction to the design of mechanisms for profit maximization with a focus on single- 
parameter settings. 


13.1 Introduction 


In previous chapters, we have studied the design of truthful mechanisms that implement 
social choice functions, such as social welfare maximization. Another fundamental 
objective, and the focus of this chapter, is the design of mechanisms in which the goal 
of the mechanism designer is profit maximization. In economics, this topic is referred 
to as optimal mechanism design. 

Our focus will be on the design of profit-maximizing auctions in settings in which 
an auctioneer is selling (respectively, buying) a set of goods/services. Formally, there 
are $n$ agents, each of whom desires some particular service. We assume that agents
are single-parameter; i.e., agent $i$’s valuation for receiving service is $v_i$ and their
valuation for no service is normalized to zero. A mechanism takes as input sealed
bids from the agents, where agent $i$’s bid $b_i$ represents his valuation $v_i$, and computes
an outcome consisting of an allocation $x = (x_1, \ldots, x_n)$ and prices $p = (p_1, \ldots, p_n)$.
Setting $x_i = 1$ represents agent $i$ being allocated service whereas $x_i = 0$ is for no
service, and $p_i$ is the amount agent $i$ is required to pay the auctioneer. We assume that
agents have quasi-linear utility expressed by $u_i = v_i x_i - p_i$. Thus, an agent’s goal in
choosing his bid is to maximize the difference between his valuation and his payment.

To make this setting quite general, we assume that there is an inherent cost c(x) in 
producing the outcome x, which must be paid by the mechanism. Our goal is to design 
the mechanism, i.e., the mapping from bid vectors to price/allocation vectors so that 
the auctioneer’s profit, defined as 


$\text{Profit} = \sum_i p_i - c(x)$,

is maximized, and the mechanism is truthful.

Many interesting auction design problems are captured within this single-parameter
framework. In what follows, we describe a number of these problems, and show that, 
for most of them, the VCG mechanism (Chapter 9), which maximizes social welfare, 
is a poor mechanism to use when the goal is profit maximization. 


Example 13.1 (single-item auction) We can use the cost function $c(x)$ to capture
the constraint that at most one item can be allocated, by setting $c(x) = 0$ if
$\sum_i x_i \le 1$ and $\infty$ otherwise. The profit of the Vickrey auction (Chapter 9) is the
second highest of the valuations in the vector $v$. If prior information about agents’
valuations is available, then there are auctions with higher profit than the Vickrey
auction.


Example 13.2 (digital goods auctions) In a digital goods auction, an auction- 
eer is selling multiple units of an item, such as a downloadable audio file or a 
pay-per-view television broadcast, to consumers each interested in exactly one 
unit. Since the marginal cost of duplicating a digital good is negligible and digital 
goods are freely disposable, we can assume that the auctioneer has an unlimited 
supply of units for sale. Thus, for digital goods auctions c(x) = 0 for all x. 

The profit of the VCG mechanism for digital goods auctions is zero. Indeed, 
since the items are available in unlimited supply, no bidder places any externality 
on any other bidder. 
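
The zero-profit claim can be checked numerically on a small instance. The sketch below is an illustration under stated assumptions rather than the chapter's own construction: `vcg` is a hypothetical brute-force helper that enumerates all 0/1 allocations, uses the Clarke pivot payment (the best welfare of the others when agent $i$ is not served, minus the others' welfare at the chosen outcome), and treats infeasible allocations as having infinite cost.

```python
import math
from itertools import product
from typing import Callable, List, Sequence, Tuple


def vcg(values: Sequence[float], cost: Callable[[Tuple[int, ...]], float]):
    """Brute-force VCG for single-parameter agents with a cost function c(x)."""
    n = len(values)
    outcomes = list(product((0, 1), repeat=n))

    def welfare(x, skip=None):
        # Sum of served agents' values (optionally ignoring agent `skip`) minus cost.
        served = sum(v * xi for k, (v, xi) in enumerate(zip(values, x)) if k != skip)
        return served - cost(x)

    x_star = max(outcomes, key=welfare)
    payments: List[float] = []
    for i in range(n):
        best_without_i = max(welfare(x, skip=i) for x in outcomes if x[i] == 0)
        payments.append(best_without_i - welfare(x_star, skip=i))
    return x_star, payments


def digital_goods_cost(x):
    return 0.0                                  # unlimited supply


def single_item_cost(x):
    return 0.0 if sum(x) <= 1 else math.inf     # at most one winner


x_dg, p_dg = vcg([4, 1, 6], digital_goods_cost)
x_si, p_si = vcg([3, 7, 5], single_item_cost)
print(x_dg, p_dg)
print(x_si, p_si)
```

With unlimited supply every agent is served and nobody pays; with the single-item cost function the same helper reproduces the second-price outcome of Example 13.1.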


Example 13.3 (single-minded combinatorial auction, known bundles) In a
combinatorial auction with single-minded agents, each agent has exactly one bundle
of items that he is interested in obtaining. Agent $i$’s value for his desired bundle,
$S_i$, is $v_i$. We use the cost function $c(x)$ to capture the constraint that each item can be
allocated to at most one bidder. Thus, $c(x) = 0$ if $\forall i, j$, $S_i \cap S_j \ne \emptyset \rightarrow x_i x_j = 0$,
and $c(x) = \infty$ otherwise.


Example 13.4 (multicast auctions) Consider a network with users residing at 
the nodes in the network, each with a valuation for receiving a broadcast that 
originates at a particular node, called the root. There are costs associated with 
transmitting data across each of the links in the network — the cost of transmitting 
across link e is c(e). Our problem is then to design an auction that chooses a 
multicast tree, the set of users to receive the broadcast, and the prices to charge 
them. In this setting, $c(x)$ is the total cost of connecting all of the agents with
$x_i = 1$ to the root (i.e., the minimum Steiner tree cost).

In most nondegenerate instances of this problem the VCG mechanism will run 
a deficit. One such example is the public project setting described in Chapter 9, 
Section 3.5 which can be mapped to a network with a single link of cost C, where 
one endpoint is the root and all the users are at the other endpoint. 
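
The deficit in the single-link public project case can be seen with a few lines of arithmetic. This is a sketch under assumptions: `public_project_vcg` is a hypothetical helper, the project is built when the total reported value covers the link cost (a tie is resolved in favor of building), and payments follow the Clarke pivot rule, under which agent $i$ pays only the part of the cost that the other agents' values do not cover.

```python
from typing import List, Sequence, Tuple


def public_project_vcg(values: Sequence[float], link_cost: float) -> Tuple[bool, List[float]]:
    """VCG outcome for a single link of cost C shared by all users."""
    total = sum(values)
    if total < link_cost:
        return False, [0.0] * len(values)
    # Agent i pays max(0, C - sum of the others' values).
    payments = [max(0.0, link_cost - (total - v)) for v in values]
    return True, payments


def profit(built: bool, payments: Sequence[float], link_cost: float) -> float:
    return sum(payments) - (link_cost if built else 0.0)


built_a, pay_a = public_project_vcg([3, 3, 3], 4)
built_b, pay_b = public_project_vcg([3, 3], 4)
print(built_a, pay_a, profit(built_a, pay_a, 4))
print(built_b, pay_b, profit(built_b, pay_b, 4))
```

In the first instance no single agent is pivotal, so nobody pays and the whole cost is a deficit; in the second each agent pays 1 and the mechanism still collects less than the cost of 4.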


All of the other examples detailed in Chapter 9, Section 3.5, i.e., reverse auctions, 
bilateral trade, multiunit auctions, and buying a path in a network, as well as many 
other problems can be modeled in this single-parameter agent framework. 

13.1.1 Organization


Our discussion of optimal mechanism design will be divided up into three categories, 
depending on our assumptions about the agents’ private values. On one hand, as is 
typical in economics, we can assume that agents’ private values are drawn from a 
known prior distribution, the so-called Bayesian approach. Given knowledge of these 
prior distributions, the Bayesian optimal mechanism is the one that achieves the largest 
expected profit for these agents, where the expectation is taken over the randomness 
in the agents’ valuations. In Section 13.2, we present the seminal result of Myerson, 
showing how to design the optimal, i.e., profit-maximizing, Bayesian auction given the 
prior distribution from which bidders’ valuations are drawn. 

On the other hand, in many cases, determining these prior distributions in advance 
may not be convenient, reasonable, or even possible. It is particularly difficult to collect 
priors in small markets, where the process of collecting information can seriously 
impact both the incentives of the agents and the performance of the mechanism. Thus, 
it is of great interest to understand to what extent we are able to design mechanisms
for profit maximization even when we know very little about bidders’ valuations. This 
approach leads us to the more traditional computer science approach of “worst-case 
analysis.” While worst-case analysis could lead to results that are overly pessimistic, 
we shall see that in many cases we are able to obtain worst-case guarantees that 
are comparable to the optimal average-case guarantees for valuations from known 
distributions. 

We begin our exploration of worst-case analysis in Section 13.3, where we survey 
techniques for approximating the optimal mechanism. We give natural mechanisms 
that approach optimality on large markets and a general formula for their performance 
as a function of the market size for small markets. 

To obtain a theory of optimal mechanism design without assumptions on the size of
the market, we adopt a framework of relative optimality. This is motivated by two key 
observations. First, as we will explain later, there is no truthful mechanism that is best 
on every input. Second, in the worst case, all the agents’ private values could be zero 
(or negligible) and thus no auction will be able to extract a high profit. In Section 13.4, 
we describe techniques for designing auctions that always (in worst case) return a profit 
that is within a small constant factor of some profit benchmark evaluated with respect 
to the agents’ true private values. 

Finally, in Section 13.5, we consider procurement settings where the auctioneer 
is looking to buy a set of goods or services that satisfy certain constraints, e.g., a 
path or a spanning tree in a graph. Specifically, we consider the problem of designing 
procurement auctions to minimize the total cost of the auctioneer (i.e., maximize their 
profit) relative to a natural benchmark. 

We conclude the chapter with a discussion of directions for future research. 


13.1.2 Preliminaries 


In this section, we review basic properties of truthful mechanisms. 
We will place two standard assumptions on our mechanisms. The first, that they
are individually rational, means that no agent has negative expected utility for taking
part in the mechanism. The second condition we require is that of no positive transfers,
which restricts the mechanism to not pay the agents when they do not win, i.e.,
$x_i = 0 \rightarrow p_i = 0$.

In general, we will allow our mechanisms to be randomized. In a randomized 
mechanism, $x_i$ is the probability that agent $i$ is allocated the good, and $p_i$ is agent $i$’s
expected payment. Since $x_i$ and $p_i$ are outputs of the mechanism, it will be useful to view
them as functions of the input bids as follows. We let $x_i(b)$, $p_i(b)$, and $u_i(b)$ represent
agent $i$’s probability of allocation, expected price, and expected utility, respectively.
Let $b_{-i} = (b_1, \ldots, b_{i-1}, ?, b_{i+1}, \ldots, b_n)$ represent the vector of bids excluding bid $i$.
Then with $b_{-i}$ fixed, we let $x_i(b_i)$, $p_i(b_i)$, and $u_i(b_i)$ represent agent $i$’s probability of
allocation, expected price, and expected utility, respectively, as a function of their own
bid. We further define the convenient notation $x_i(b_i, b_{-i}) = x_i(b)$, $p_i(b_i, b_{-i}) = p_i(b)$,
and $u_i(b_i, b_{-i}) = u_i(b)$.


Definition 13.5 A mechanism is truthful in expectation if and only if for all $i$,
$v_i$, $b_i$, and $\mathbf{b}_{-i}$, agent $i$’s expected utility for bidding their valuation, $v_i$, is at least
their expected utility for bidding any other value. In other words,


$u_i(v_i, \mathbf{b}_{-i}) \ge u_i(b_i, \mathbf{b}_{-i})$.


For single-parameter agents, we restate the characterization of truthful mechanisms 
which was proven in Chapter 9, Section 5.6. 


Theorem 13.6 A mechanism is truthful in expectation if and only if, for any
agent $i$ and any fixed choice of bids by the other agents $\mathbf{b}_{-i}$,


(i) $x_i(b_i)$ is monotone nondecreasing.
(ii) $p_i(b_i) = b_i x_i(b_i) - \int_0^{b_i} x_i(z)\,dz$.


Given this theorem, we see that once an allocation rule x(-) is fixed, the pay- 
ment rule p(-) is also fixed. Thus, in specifying a mechanism we need specify 
only a monotone allocation rule and from it the truth-inducing payment rule can be 
derived. 
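To make the derivation of payments from an allocation rule concrete, here is a minimal numerical sketch of part (ii) of Theorem 13.6. The function names, the two example allocation rules, and the use of a midpoint Riemann sum are illustrative assumptions, not part of the chapter.

```python
from typing import Callable


def truthful_payment(allocation: Callable[[float], float], bid: float,
                     steps: int = 10_000) -> float:
    """Payment b * x(b) - integral_0^b x(z) dz, using a midpoint Riemann sum."""
    width = bid / steps
    integral = sum(allocation((k + 0.5) * width) for k in range(steps)) * width
    return bid * allocation(bid) - integral


def linear_rule(z: float) -> float:
    """Hypothetical randomized rule on bids in [0, 1]: win with probability z."""
    return z


def posted_price_rule(z: float, threshold: float = 0.3) -> float:
    """Hypothetical deterministic rule: win exactly when the bid is at least 0.3."""
    return 1.0 if z >= threshold else 0.0


print(truthful_payment(linear_rule, 0.8))        # close to 0.8**2 / 2 = 0.32
print(truthful_payment(posted_price_rule, 0.9))  # close to the threshold 0.3
```

The second call illustrates the deterministic case: a winning bidder pays the threshold, independent of the bid itself.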

It is useful to specialize Theorem 13.6 to the case where the mechanism is deterministic.
In this case, the monotonicity of $x_i(b_i)$ implies that, for $\mathbf{b}_{-i}$ fixed, there is some
threshold bid $t_i$ such that $x_i(b_i) = 1$ for all $b_i > t_i$ and $0$ for all $b_i < t_i$. Moreover, the
second part of the theorem then implies that for any $b_i > t_i$, $p_i(b_i) = b_i - \int_{t_i}^{b_i} dz = t_i$.
We conclude the following.


Observation 13.1.1. Any deterministic truthful auction is specified by a set of
functions $t_i(\mathbf{b}_{-i})$ which determine, for each bidder $i$ and each set of bids $\mathbf{b}_{-i}$, an
offer price to bidder $i$ such that bidder $i$ wins and pays price $t_i$ if $b_i > t_i$, or loses
and pays nothing if $b_i < t_i$. (Ties can be broken arbitrarily.)


13.2 Bayesian Optimal Mechanism Design


In this section we describe the conventional economics approach of Bayesian optimal 
mechanism design where it is assumed that the valuations of the agents are drawn from a 
known distribution. The mechanism we describe is known as the Myerson mechanism: 
it is the truthful mechanism that maximizes the auctioneer’s expected profit, where the 
expectation is taken over the randomness in the agents’ valuations. 

Consider, for example, a single-item auction with two bidders whose valuations are 
known to be drawn independently at random from the uniform distribution on [0, 1]. 
In Chapter 9, Section 6.3, it was shown that in this setting the expected revenue of both 
the Vickrey (second-price) auction and of the first-price auction is 1/3. In fact, it was 
observed that any auction that always allocates the item to the bidder with the higher 
valuation achieves the same expected revenue. 

Does this mean that 1/3 is the best we can do, in expectation, with bidders of this 
type? The answer is no. Consider the following auction. 


Definition 13.7 (Vickrey auction with reservation price r) The Vickrey auction
with reservation price $r$, $\mathrm{VA}_r$, sells the item to the highest bidder bidding at
least $r$. The price the winning bidder pays is the maximum of the second highest
bid and $r$.


It is a straightforward probabilistic calculation to show that the expected profit of 
the Vickrey auction with reservation price r = 1/2 is 5/12. Thus, it is possible to get 
higher expected profit than the Vickrey auction by sometimes not allocating the item! 
This raises the problem of identifying, among the class of all truthful auctions, the 
auction that gives the optimal profit in expectation. The derivation in the next section 
answers this question and shows that in fact for this scenario $\mathrm{VA}_{1/2}$ is the optimal
auction.
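The 5/12 figure can be checked numerically. The sketch below averages the revenue of the Vickrey auction with reservation price $r$ over a midpoint grid of valuation pairs in $[0, 1]^2$; the function names and the grid-averaging approach are illustrative assumptions, and truthful bidding is assumed.

```python
def vickrey_reserve_revenue(v1: float, v2: float, reserve: float) -> float:
    """Revenue of the Vickrey auction with a reserve for two truthful bidders."""
    high, low = max(v1, v2), min(v1, v2)
    if high < reserve:
        return 0.0              # nobody meets the reserve, the item is not sold
    return max(low, reserve)    # winner pays max(second highest bid, reserve)


def expected_revenue(reserve: float, grid: int = 400) -> float:
    """Average revenue over a midpoint grid approximating Uniform[0,1]^2."""
    points = [(k + 0.5) / grid for k in range(grid)]
    total = sum(vickrey_reserve_revenue(a, b, reserve)
                for a in points for b in points)
    return total / grid ** 2


print(expected_revenue(0.0))  # close to 1/3 (plain Vickrey auction)
print(expected_revenue(0.5))  # close to 5/12
```

For general $r$ the same calculation gives expected revenue $1/3 + r^2 - 4r^3/3$, which is maximized at $r = 1/2$.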


13.2.1 Virtual Valuations, Virtual Surplus, and Expected Profit 


We assume that the valuations of the agents, $v_1, \ldots, v_n$, are drawn independently at
random from known (but not necessarily identical) continuous probability distributions.
For simplicity, we assume that $v_i \in [0, h]$ for all $i$. We denote by $F_i$ the distribution
function from which bidder $i$’s valuation, $v_i$, is drawn (i.e., $F_i(z) = \Pr[v_i < z]$) and
by $f_i$ its density function (i.e., $f_i(z) = \frac{d}{dz} F_i(z)$). Since the agents’ valuations are
independent, the joint distribution from which $\mathbf{v}$ is drawn is just the product distribution
$\mathbf{F} = F_1 \times \cdots \times F_n$.

We now define two key notions: virtual valuations and virtual surplus.


Definition 13.8 The virtual valuation of agent $i$ with valuation $v_i$ is


$\phi_i(v_i) = v_i - \frac{1 - F_i(v_i)}{f_i(v_i)}$.


Definition 13.9 Given valuations, $v_i$, and corresponding virtual valuations,
$\phi_i(v_i)$, the virtual surplus of allocation $\mathbf{x}$ is $\sum_i \phi_i(v_i) x_i - c(\mathbf{x})$.


As the surplus of an allocation is $\sum_i v_i x_i - c(\mathbf{x})$, the virtual surplus of an allocation is
the surplus of the allocation with respect to agents whose valuations are replaced by
their virtual valuations, $\phi_i(v_i)$.

We now show that any truthful mechanism has expected profit equal to its expected 
virtual surplus. Thus, to maximize expected profit, the mechanism should choose an 
allocation which maximizes virtual surplus. In so far as this allocation rule is monotone, 
this gives the optimal truthful mechanism! 


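Theorem 13.10 The expected profit of any truthful mechanism, $\mathcal{M}$, is equal to
its expected virtual surplus, i.e., $\mathbf{E}_{\mathbf{v}}[\mathcal{M}(\mathbf{v})] = \mathbf{E}_{\mathbf{v}}\left[\sum_i \phi_i(v_i) x_i(\mathbf{v}) - c(\mathbf{x}(\mathbf{v}))\right]$.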


Thus, if the mechanism, on each bid vector $\mathbf{b}$, chooses an allocation, $\mathbf{x}$, which
maximizes $\sum_i \phi_i(b_i) x_i - c(\mathbf{x})$, the auctioneer’s profit will be maximized. Notice that
if we employ a deterministic tie-breaking rule then the resulting mechanism will be 
deterministic. Theorem 13.10 follows from Lemma 13.11 below, and the independence 
of the agents’ valuations. 


Lemma 13.11 Consider any truthful mechanism and fix the bids $\mathbf{b}_{-i}$ of all
bidders except for bidder $i$. The expected payment of a bidder $i$ satisfies:


$\mathbf{E}_{b_i}[p_i(b_i)] = \mathbf{E}_{b_i}[\phi_i(b_i) x_i(b_i)]$.


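PROOF To simplify notation, we drop the subscript $i$ and refer simply to the bid
$b$ being randomly chosen from distribution $F$ with density function $f$.
By Theorem 13.6, we have


$$\mathbf{E}_b[p(b)] = \int_{b=0}^{h} p(b) f(b)\,db = \int_{b=0}^{h} b\,x(b) f(b)\,db - \int_{b=0}^{h} \int_{z=0}^{b} x(z) f(b)\,dz\,db.$$


Focusing on the second term and switching the order of integration, we have


$$\begin{aligned}
\mathbf{E}_b[p(b)] &= \int_{b=0}^{h} b\,x(b) f(b)\,db - \int_{z=0}^{h} x(z) \int_{b=z}^{h} f(b)\,db\,dz \\
&= \int_{b=0}^{h} b\,x(b) f(b)\,db - \int_{z=0}^{h} x(z)\,[1 - F(z)]\,dz.
\end{aligned}$$


Now, we rename $z$ to $b$ and factor out $x(b) f(b)$ to get


$$\begin{aligned}
\mathbf{E}_b[p(b)] &= \int_{b=0}^{h} b\,x(b) f(b)\,db - \int_{b=0}^{h} x(b)\,[1 - F(b)]\,db \\
&= \int_{b=0}^{h} \left[ b - \frac{1 - F(b)}{f(b)} \right] x(b) f(b)\,db \\
&= \mathbf{E}_b[\phi(b) x(b)].
\end{aligned}$$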
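As a quick sanity check of Lemma 13.11 (an illustrative special case, not part of the proof), take $F$ uniform on $[0, 1]$, so that $f(b) = 1$ and $\phi(b) = b - (1 - b) = 2b - 1$ by Definition 13.8, and let the allocation rule be a posted price $t \in [0, 1]$: $x(b) = 1$ for $b \ge t$ and $x(b) = 0$ otherwise. By Theorem 13.6 a winner pays $t$, and both sides of the lemma evaluate to the same quantity:

```latex
\begin{aligned}
\mathbf{E}_b[p(b)] &= t \cdot \Pr[b \ge t] = t(1 - t), \\
\mathbf{E}_b[\phi(b)\,x(b)] &= \int_t^1 (2b - 1)\,db = \bigl[b^2 - b\bigr]_t^1 = t - t^2 = t(1 - t).
\end{aligned}
```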

13.2.2 Truthfulness of Virtual Surplus Maximization


Of course, it is not immediately clear that maximizing virtual surplus results in a 
truthful mechanism. By Theorem 13.6, this depends on whether or not virtual surplus 
maximization results in a monotone allocation rule. Recall that the VCG mechanism, 
which maximizes the actual surplus, i.e., $\sum_i v_i x_i - c(\mathbf{x})$, is truthful precisely because
surplus maximization results in a monotone allocation rule. Clearly then, virtual surplus 
maximization gives an allocation that is monotone in agent valuations precisely when 
virtual valuation functions are monotone in agent valuations. Indeed, it is easy to find 
examples of the converse which show that nonmonotone virtual valuations result in a 
nonmonotone allocation rule. Thus, we conclude the following lemma. 


Lemma 13.12 Virtual surplus maximization is truthful if and only if, for all $i$,
$\phi_i(v_i)$ is monotone nondecreasing in $v_i$.


A sufficient condition for monotone virtual valuations is implied by the monotone 
hazard rate assumption. The hazard rate of a distribution is defined as $f(z)/(1 - F(z))$.
Clearly, if the hazard rate is monotone nondecreasing, then the virtual valuations are 
monotone nondecreasing as well. There is a technical construction that extends these 
results to the nonmonotone case, but we do not cover it here. 
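The step from a monotone hazard rate to monotone virtual valuations can be spelled out in one line. The symbol $\lambda$ for the hazard rate is introduced here only for illustration; it is not notation used elsewhere in the chapter.

```latex
\lambda(v) = \frac{f(v)}{1 - F(v)}
\quad\Longrightarrow\quad
\phi(v) = v - \frac{1 - F(v)}{f(v)} = v - \frac{1}{\lambda(v)} .
% If \lambda is nondecreasing, then 1/\lambda is nonincreasing, so \phi is the
% sum of the increasing function v and the nondecreasing function -1/\lambda(v).
% Example (uniform on [0,1]): \lambda(v) = 1/(1 - v), hence \phi(v) = v - (1 - v) = 2v - 1.
```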


Definition 13.13 Let $\mathbf{F}$ be the prior distribution of agents’ valuations satisfying
the monotone hazard rate assumption. We denote by $\mathrm{Mye}_{\mathbf{F}}(\mathbf{b})$ the Myerson mechanism:
on input $\mathbf{b}$, output $\mathbf{x}$ to maximize the virtual surplus (defined with respect
to the distribution $\mathbf{F}$).


Thus, for single parameter problems, profit maximization in a Bayesian setting
reduces to virtual surplus maximization. This allows us to describe Myerson’s optimal
mechanism, $\mathrm{Mye}_{\mathbf{F}}(\mathbf{b})$, as follows:


(i) Given the bids $\mathbf{b}$ and $\mathbf{F}$, compute “virtual bids”: $b'_i = \phi_i(b_i)$.
(ii) Run VCG on the virtual bids $\mathbf{b}'$ to get $\mathbf{x}'$ and $\mathbf{p}'$.
(iii) Output $\mathbf{x} = \mathbf{x}'$ and $\mathbf{p}$ with $p_i = \phi_i^{-1}(p'_i)$.
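For a single item, steps (i) through (iii) can be sketched as follows. This is a minimal illustration under stated assumptions: bidder $i$’s valuation is taken to be uniform on $[0, a_i]$, so that $\phi_i(v) = 2v - a_i$, and VCG on the virtual bids is taken to be a second-price auction in which the seller keeps the item when every virtual bid is negative. All function and parameter names are hypothetical.

```python
from typing import List, Optional, Tuple


def virtual_value(bid: float, upper: float) -> float:
    """phi(v) = v - (1 - F(v)) / f(v) for v ~ Uniform[0, upper], i.e. 2v - upper."""
    return 2.0 * bid - upper


def inverse_virtual_value(virtual: float, upper: float) -> float:
    """Inverse of virtual_value for the same uniform distribution."""
    return (virtual + upper) / 2.0


def myerson_single_item(bids: List[float],
                        uppers: List[float]) -> Tuple[Optional[int], float]:
    """Return (winner index or None, price) for a single-item auction."""
    # Step (i): compute virtual bids.
    virtual_bids = [virtual_value(b, a) for b, a in zip(bids, uppers)]
    # Step (ii): second-price auction on virtual bids; no sale if all are negative.
    winner = max(range(len(bids)), key=lambda i: virtual_bids[i])
    if virtual_bids[winner] < 0:
        return None, 0.0
    rivals = [virtual_bids[j] for j in range(len(bids)) if j != winner]
    virtual_price = max(rivals + [0.0])
    # Step (iii): map the virtual price back through the winner's phi^{-1}.
    return winner, inverse_virtual_value(virtual_price, uppers[winner])


print(myerson_single_item([0.8, 0.6], [1.0, 1.0]))  # bidder 0 wins, price near 0.6
print(myerson_single_item([0.8, 0.3], [1.0, 1.0]))  # bidder 0 wins, price 0.5
print(myerson_single_item([0.4, 0.3], [1.0, 1.0]))  # (None, 0.0): item is kept
```

With identical uniform $[0, 1]$ priors, the price is the larger of the rival bid and $1/2$, which is the Vickrey auction with reservation price $1/2$.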


13.2.3 Applications of Myerson’s Optimal Mechanism 


The formulation of virtual valuations and the statement that the optimal mechanism is 
the one that maximizes virtual surplus is not the end of the story. In many relevant cases 
this formulation allows one to derive very simple descriptions of the optimal mechanism.
We now consider a couple of examples to obtain a more precise understanding
of $\mathrm{Mye}_{\mathbf{F}}(\mathbf{b})$ and illustrate this point.


Example 13.14 (single-item auction) In a single-item auction, the surplus 
maximizing allocation gives the item to the bidder with the highest valuation, 
unless the highest valuation is less than 0 in which case the auctioneer keeps the 
item. Usually, we assume that all bidders’ valuations are at least zero, or they 
would not want to participate in the auction, so the auctioneer never keeps the item.
However, when we maximize virtual surplus, it may be the case that a bidder
has positive valuation but negative virtual valuation. Thus, for allocating a single
item, the optimal mechanism finds the bidder with the largest nonnegative virtual
valuation if there is one, and allocates to that bidder.


What about the payments? Suppose that there are only two bidders and we
break ties in favor of bidder 1. Then bidder 1 wins precisely when $\phi_1(b_1) \ge
\max\{\phi_2(b_2), 0\}$. This is a deterministic allocation rule, and thus the payment that
a winning bidder 1 must make is $p_1 = \inf\{b : \phi_1(b) \ge \phi_2(b_2) \wedge \phi_1(b) \ge 0\}$.
Suppose that $F_1 = F_2 = F$, which implies that $\phi_1(z) = \phi_2(z) = \phi(z)$. Then
this simplifies to $p_1 = \max(b_2, \phi^{-1}(0))$. Similarly, bidder 2’s payment upon
winning is $p_2 = \max(b_1, \phi^{-1}(0))$. Thus we arrive at one of Myerson’s main
observations.


Theorem 13.15 The optimal single-item auction for bidders with valuations
drawn i.i.d. from distribution $F$ is the Vickrey auction with reservation price
$\phi^{-1}(0)$, i.e., $\mathrm{VA}_{\phi^{-1}(0)}$.


For example, when $F$ is uniform on $[0, 1]$, we can plug the equations $F(z) = z$
and $f(z) = 1$ into the formula for the virtual valuation function (Definition 13.8) to
conclude that $\phi(z) = 2z - 1$. Thus, the virtual valuations are uniformly distributed on
$[-1, 1]$. We can easily solve for $\phi^{-1}(0) = 1/2$. We conclude that the optimal auction
for two bidders with valuations uniform on $[0, 1]$ is the Vickrey auction with reservation
price $1/2$, $\mathrm{VA}_{1/2}$.


Example 13.16 (Digital goods auction) Recall that in a digital goods auction,
we have $c(\mathbf{x}) = 0$ for all $\mathbf{x}$. Thus, to maximize virtual surplus, we allocate to each
bidder $i$ such that $\phi_i(b_i) \ge 0$. As in the previous example, the payment a winning
bidder must make is his minimum winning bid, i.e., $\inf\{b : \phi_i(b) \ge 0\}$, which is
identically $\phi_i^{-1}(0)$.

Notice that with $n$ bidders whose valuations are drawn independently from
the same distribution function $F$, the reserve price for each bidder is $\phi^{-1}(0)$, the
solution to $b - \frac{1 - F(b)}{f(b)} = 0$. It is easy to check that this is precisely the optimal
sale price for the distribution $F$: the take-it-or-leave-it price we would offer each
bidder to maximize our expected profit.


Definition 13.17 (optimal sale price) The optimal sale price for distribution $F$
is $\mathrm{opt}(F) = \mathrm{argmax}_z\, z(1 - F(z))$.
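The claim that the optimal sale price coincides with $\phi^{-1}(0)$ follows from a first-order condition. The following worked step is a sketch that assumes $F$ is differentiable with $f(z) > 0$ on its support and that $\phi$ is monotone (as under the monotone hazard rate assumption):

```latex
\frac{d}{dz}\Bigl[ z\,(1 - F(z)) \Bigr]
  = (1 - F(z)) - z f(z)
  = -f(z)\left( z - \frac{1 - F(z)}{f(z)} \right)
  = -f(z)\,\phi(z).
% The derivative is positive where \phi(z) < 0 and negative where \phi(z) > 0,
% so the expected profit z(1 - F(z)) peaks exactly at z = \phi^{-1}(0).
% Example (uniform on [0,1]): z(1 - z) is maximized at z = 1/2 = \phi^{-1}(0).
```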


Summarizing, we obtain: 


Theorem 13.18 The optimal digital goods auction for $n$ bidders with valuations
drawn i.i.d. from distribution $F$ is to offer each bidder the price $\mathrm{opt}(F) = \phi^{-1}(0)$.


13.3 Prior-Free Approximations to the Optimal Mechanism


In the previous section, we saw how to design the optimal mechanism when agents’ 
valuations were drawn from known distributions. The assumption that the valuations 
are drawn from a known prior distribution makes sense in very large markets. In 
fact, as we shall see shortly, in large enough markets, a good approximation to the 
prior distribution can be learned on-the-fly and thus there are prior-free mechanisms 
that obtain nearly the optimal profit. We discuss these results in the first part of this 
section. 

In small markets, on the other hand, incentives issues in learning an approximation 
of the prior distribution result in loss of performance and fundamental mechanism 
design challenges. Thus, new techniques are required in these settings. We develop an 
approach based on random sampling and analyze its performance in a way that makes 
explicit the connection between the size of the market and a mechanism’s performance. 


13.3.1 Empirical Distributions 


The central observation that enables effective profit maximization without priors is 
Observation 13.1.1, which says that a truthful mechanism can use the reported bids of 
all the other agents in order to make a pricing decision for a particular agent. 


Definition 13.19 (empirical distribution) For a vector of bids $\mathbf{b} =
(b_1, \ldots, b_n)$, the empirical distribution for these bids is $\hat{F}_{\mathbf{b}}$ satisfying, for
$X \sim \hat{F}_{\mathbf{b}}$, $\Pr[X > z] = n_z/n$, where $n_z$ is the number of bids in $\mathbf{b}$ above value $z$.


We now present a variant on Myerson’s mechanism that can be used without any 
prior knowledge. As we shall see below, this mechanism has interesting interpretations 
in several contexts. 


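Definition 13.20 (empirical Myerson mechanism) The empirical Myerson
mechanism, EM, on input $\mathbf{b}$, for each $i$, simulates $\mathrm{Mye}_{\hat{F}_{\mathbf{b}_{-i}}}(\mathbf{b})$ to obtain outcome
$\mathbf{x}^{(i)}$ and payments $\mathbf{p}^{(i)}$. It then produces outcome $\mathbf{x}$ and $\mathbf{p}$ with $x_i = x_i^{(i)}$ and
$p_i = p_i^{(i)}$.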


The outcome and payment for agent $i$ in the empirical Myerson mechanism is
based on the simulation of $\mathrm{Mye}_{\hat{F}_{\mathbf{b}_{-i}}}(\mathbf{b})$, and since agent $i$ cannot manipulate $\hat{F}_{\mathbf{b}_{-i}}$, this
mechanism is truthful.

There are two issues that we need to address in order to understand the performance
of the EM mechanism. First, we need to see if the outcomes it produces are feasible. The
issue is that the allocation to different agents, say $i$ and $j$, is determined from different
information ($\mathbf{b}_{-i}$ versus $\mathbf{b}_{-j}$). As we shall see, this inconsistency will sometimes
produce allocations, $\mathbf{x}$, that are not feasible (i.e., $c(\mathbf{x}) = \infty$). Second, in those situations
where it does produce feasible allocations, we need to understand how effective the
mechanism is at profit maximization. The hope is that, in large markets, $\hat{F}_{\mathbf{b}_{-i}}$ should
be close to $\hat{F}_{\mathbf{b}}$ and hence the law of large numbers should imply good performance.
Again, we will see that this does not hold in general.

We begin by considering the application of EM to digital goods auctions, where 
there is no feasibility issue. 


Definition 13.21 (deterministic optimal price auction) We define the deter- 
ministic optimal price auction (DOP) as EM applied to the digital goods auction 
problem. 


In the previous section, we saw that if each agent’s valuation is drawn from the
same distribution $F$, Myerson’s mechanism offers price $\phi^{-1}(0) = \mathrm{opt}(F)$ to each bidder.
The deterministic optimal price auction, on the other hand, offers agent $i$ price
$\mathrm{opt}(\hat{F}_{\mathbf{b}_{-i}})$. Using the short-hand notation $\mathrm{opt}(\mathbf{b}) = \mathrm{opt}(\hat{F}_{\mathbf{b}})$, Observation 13.1.1 allows
us to express DOP quite simply as the auction defined by $t_i(\mathbf{b}_{-i}) = \mathrm{opt}(\mathbf{b}_{-i})$. Since
$\mathbf{b}_{-i}$ is different for each agent, in general the prices offered the agents are different.
Nonetheless, the law of large numbers implies the following result (which is a corollary
of Theorem 13.30 proved in the next section).


Theorem 13.22 For the digital goods setting and n bids b distributed i.i.d. from 
distribution F with bounded support, the profit of DOP approaches the profit of 
Myerson in the limit as n increases. 


Unfortunately, the assumption that the input comes from an unknown, but i.i.d. dis- 
tribution is crucially important to this result as the following example shows. 


Example 13.23 With 10 bids at \$10, and 90 bids at \$1, consider the prices
$t_1(\mathbf{b}_{-1})$ and $t_{10}(\mathbf{b}_{-10})$ that DOP offers bidders bidding \$1 and \$10 respectively:


• $\mathbf{b}_{-1}$ is 89 bids at \$1 and 10 bids at \$10, so $\mathrm{opt}(\mathbf{b}_{-1}) = \$10$, and
• $\mathbf{b}_{-10}$ is 90 bids at \$1 and 9 bids at \$10, so $\mathrm{opt}(\mathbf{b}_{-10}) = \$1$.


Thus, bids at \$10 are accepted, but offered price \$1, while bids at \$1 are rejected.
The total profit is \$10 whereas the optimal is \$100. This example can be made
arbitrarily bad.
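The numbers in this example are easy to reproduce. The sketch below implements DOP for digital goods as the auction with offer prices $t_i(\mathbf{b}_{-i}) = \mathrm{opt}(\mathbf{b}_{-i})$. The function names are hypothetical, and two conventions are assumptions of this sketch: ties between equally profitable prices go to the lowest price, and a bidder whose bid equals the offered price buys.

```python
from typing import List


def opt_price(bids: List[float]) -> float:
    """Best single sale price for a bid list: the bid value p maximizing p * #{b >= p}.

    Assumptions: ties go to the lowest such price; an empty list gives an
    infinite price (no sale).
    """
    if not bids:
        return float("inf")
    return max(sorted(set(bids)),
               key=lambda p: p * sum(1 for b in bids if b >= p))


def dop_profit(bids: List[float]) -> float:
    """Offer bidder i the price opt(b_{-i}); the bidder buys when the bid covers it."""
    profit = 0.0
    for i, bid in enumerate(bids):
        price = opt_price(bids[:i] + bids[i + 1:])
        if bid >= price:
            profit += price
    return profit


def single_price_optimum(bids: List[float]) -> float:
    """Profit of the best single price offered to every bidder."""
    return max(p * sum(1 for b in bids if b >= p) for p in set(bids))


bids = [10.0] * 10 + [1.0] * 90
print(dop_profit(bids))            # 10.0
print(single_price_optimum(bids))  # 100.0
```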


What happened in this example is the result of the inconsistency between the
distribution $\hat{F}_{\mathbf{b}_{-i}}$ assumed when choosing a price for agent $i$, and the distribution $\hat{F}_{\mathbf{b}_{-j}}$
assumed when choosing a price for agent $j$. Had we just run $\mathrm{Mye}_{\hat{F}_{\mathbf{b}_{-i}}}$ or $\mathrm{Mye}_{\hat{F}_{\mathbf{b}_{-j}}}$ on all
bids, all would have been well. Indeed, in this example, we would have chosen either
price \$1 for everyone or price \$10 for everyone. Both prices would have been fine.

This problem is not just one with DOP, but with any symmetric deterministic digital
goods auction.¹ Indeed, the problem inherent in this example can be generalized to
prove the following theorem.


¹ An auction is symmetric if the outcome and prices are not a function of the order of the input bids, but rather
just the set of bids.




Theorem 13.24 There do not exist constants $\beta$ and $\gamma$ and a symmetric deterministic
truthful auction, $\mathcal{A}$, with profit at least $\mathrm{OPT}/\beta - h\gamma$ on all bid vectors
$\mathbf{b}$ with $b_i \in [1, h]$.


The inconsistency of EM can be more serious than just low profit on some, perhaps 
unlikely, inputs; if some outcomes are infeasible (i.e., $c(\mathbf{x}) = \infty$ for some $\mathbf{x}$) then EM
may result in infeasible outcomes! In the next section we see how these consistency 
issues can be partially addressed through the use of random sampling. 


13.3.2 Random Sampling 


Random sampling plays an important role in the design of economic mechanisms. For 
example, during elections, polls that predict each candidate’s ranking affect the results 
of the elections; and in many settings, market analysis and user studies using a (small) 
random sample of the population can lead to good decisions in product development 
and pricing. In this section, we consider a natural extension of the empirical Myerson 
mechanism that uses random sampling to address the consistency issues raised in the 
preceding section. 


Definition 13.25 (Random sampling empirical Myerson) The random sampling
empirical Myerson mechanism (RSEM) works as follows:
(i) Solicit bids $\mathbf{b} = (b_1, \ldots, b_n)$.
(ii) Partition the bids into two sets $\mathbf{b}'$ and $\mathbf{b}''$ uniformly at random.
(iii) Compute empirical distributions for each set: $\hat{F}' = \hat{F}_{\mathbf{b}'}$ and $\hat{F}'' = \hat{F}_{\mathbf{b}''}$.
(iv) Run $\mathrm{Mye}_{\hat{F}''}(\mathbf{b}')$ and $\mathrm{Mye}_{\hat{F}'}(\mathbf{b}'')$.


For digital goods auctions, we can replace Steps iii and iv by their more natural
interpretations (facilitated by the short-hand notation $\mathrm{opt}(\mathbf{b}) = \mathrm{opt}(\hat{F}_{\mathbf{b}})$):


(iii)’ Compute the optimal sale prices $p' = \mathrm{opt}(\mathbf{b}')$ and $p'' = \mathrm{opt}(\mathbf{b}'')$.
(iv)’ Offer price $p'$ to bidders in $\mathbf{b}''$ and price $p''$ to bidders in $\mathbf{b}'$.
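The digital goods variant of the steps above can be sketched in a few lines. The function names are hypothetical, and the sketch makes three assumptions that the text leaves open: each bid is assigned to a side by an independent fair coin, ties between equally profitable prices go to the lowest price, and an empty side produces an infinite price (so nothing is sold to the other side).

```python
import random
from typing import List, Sequence


def opt_price(bids: Sequence[float]) -> float:
    """Bid value p maximizing p * #{b >= p}; infinite (no sale) for an empty list."""
    if not bids:
        return float("inf")
    return max(sorted(set(bids)),
               key=lambda p: p * sum(1 for b in bids if b >= p))


def revenue_at(price: float, bids: Sequence[float]) -> float:
    """Revenue from offering one price to every bidder in the list."""
    return sum((price for b in bids if b >= price), 0.0)


def rsop_profit_for_partition(part_a: Sequence[float],
                              part_b: Sequence[float]) -> float:
    """Steps (iii)' and (iv)': each side's optimal price is offered to the other side."""
    price_a = opt_price(part_a)   # p' computed from b'
    price_b = opt_price(part_b)   # p'' computed from b''
    return revenue_at(price_a, part_b) + revenue_at(price_b, part_a)


def rsop_profit(bids: Sequence[float], rng: random.Random) -> float:
    """Step (ii) followed by steps (iii)' and (iv)'."""
    part_a: List[float] = []
    part_b: List[float] = []
    for bid in bids:
        (part_a if rng.random() < 0.5 else part_b).append(bid)
    return rsop_profit_for_partition(part_a, part_b)


bids = [10.0] * 10 + [1.0] * 90
print(rsop_profit(bids, random.Random(0)))
```

Because the price offered to a bidder depends only on bids from the opposite side, no bidder can influence the price they face, which is what keeps the auction truthful.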


We refer to the digital goods variant of the random sampling empirical Myerson 
mechanism as the random sampling optimal price auction (RSOP). The randomization 
in RSOP allows it to bypass the deterministic impossibility for worst case settings 
leading to the following theorem. (Again, this is a corollary of Theorem 13.30,
which is proven in the next section.)


Theorem 13.26 For $\mathbf{b}$ with $b_i \in [1, h]$, the expected revenue of RSOP approaches
that of the optimal single price sale as the number of bidders grows.


Similar results do not necessarily hold for more general settings. It is easy to imagine
situations where RSEM also gives infeasible outcomes as the following example
illustrates.


Example 13.27 Consider the setting where we are selling a digital good in one
of two markets. For convenience, bidders $1, \ldots, i$ are in market A and bidders
$i+1, \ldots, n$ are in market B. Due, perhaps, to government regulations, it is not
legal to sell the good to bidders in both markets simultaneously. Thus, feasible
solutions will have winners either only from market A or only from market B.
It is easy to construct settings where RSEM will sell to one market in $\mathbf{b}'$ and the
other in $\mathbf{b}''$. The combined outcome, however, is not feasible.


The biggest open problem in prior-free mechanism design is to understand how to 
approximate the optimal mechanism in more general settings. 


13.3.3 Convergence Rates 


As we have discussed above, the law of large numbers implies that the profit of the 
random sampling auction, say in the case of digital goods, is asymptotically optimal 
as the number of bidders grows. In this section, we study the rate at which the auction 
approaches optimal performance. The theorem we prove will enable us to obtain a 
precise relationship between the complexity of the class of outcomes considered by 
RSOP and its convergence rate. The results in this section will also give us a framework 
for evaluating the performance of random sampling-based mechanisms in very general 
contexts. 

We make our discussion concrete, using the example of the digital goods auction 
problem. Recall that RSOP uses a subroutine that computes the optimal sale price for 
the bids in each partition of the bidders. Suppose that we allowed the auctioneer the 
ability to artificially restrict prices to be in some set Q. For example, the auctioneer 
might only sell at integer prices, in which case Q would be the set of integers. The 
auctioneer could further limit the set of possible prices, for example, by having Q be 
powers of 2. We will see that different choices of Q will give us different bounds on 
the convergence rate. 

Given $Q$, we define $\mathrm{RSOP}_Q$ as the random sampling auction that computes the
optimal price from $Q$ on each partition and offers it to bidders in the opposite partition.
We make use of the following notation. Let $q(b_i)$ be the payment made by bidder $i$
when offered $q \in Q$. That is, $q(b_i) = q$ if $b_i \ge q$ and $q(b_i) = 0$ otherwise. Let $q(\mathbf{b}) =
\sum_i q(b_i)$. Finally, define $\mathrm{opt}_Q(\mathbf{b}) = \mathrm{argmax}_{q \in Q}\, q(\mathbf{b})$ as the $q$ that gives the optimal
profit for $\mathbf{b}$, and $\mathrm{OPT}_Q(\mathbf{b})$ to be this optimal profit, i.e., $\mathrm{OPT}_Q(\mathbf{b}) = \max_{q \in Q} q(\mathbf{b})$.

The bounds we give in this section show the rate at which the profit of $\mathrm{RSOP}_Q(\mathbf{b})$
approaches $\mathrm{OPT}_Q(\mathbf{b})$ with some measure of the size of the market. The measure we
use is $\mathrm{OPT}_Q$ itself, as this gives us the most general and precise result. Thus, these
results show the degree to which $\mathrm{RSOP}_Q$ approximates $\mathrm{OPT}_Q$ as $\mathrm{OPT}_Q$ grows large in
comparison to $h$, an upper bound on the payment of any agent, and the complexity of
$Q$.


Definition 13.28 Given partitions $\mathbf{b}'$ and $\mathbf{b}''$, price $q$ in $Q$ is $\epsilon$-good if


$|q(\mathbf{b}') - q(\mathbf{b}'')| \le \epsilon\, \mathrm{OPT}_Q(\mathbf{b})$.


Lemma 13.29 For $\mathbf{b}$ and $h$ satisfying $q(b_i) \le h$, for all $i$, if bids $\mathbf{b}$ are randomly
partitioned into $\mathbf{b}'$ and $\mathbf{b}''$ then $q$ is not $\epsilon$-good with probability at most
$2e^{-\epsilon^2\, \mathrm{OPT}_Q(\mathbf{b})/(2h)}$.


The proof of this lemma follows from McDiarmid’s inequality, see Exercise 13.5. 
The following is the main theorem of this section. A natural interpretation of h is an 
upper bound on the highest valuation, i.e., $h = \max_i v_i$.


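Theorem 13.30 For $Q$, $\mathbf{b}$, and $h$ satisfying $q(b_i) \le h$, for all $q$ and $i$, and
$\mathrm{OPT}_Q(\mathbf{b}) \ge \frac{8h}{\epsilon^2} \ln\frac{2|Q|}{\delta}$, with probability $1 - \delta$ the profit of $\mathrm{RSOP}_Q$ is at least
$(1 - \epsilon)\, \mathrm{OPT}_Q(\mathbf{b})$.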


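PROOF Assume that $\mathrm{OPT}_Q(\mathbf{b}) \ge \frac{8h}{\epsilon^2} \ln\frac{2|Q|}{\delta}$. For random partitioning of $\mathbf{b}$ into
$\mathbf{b}'$ and $\mathbf{b}''$, Lemma 13.29 implies that the probability $q \in Q$ is not $\frac{\epsilon}{2}$-good is at
most $\delta/|Q|$. Using a union bound over all $q \in Q$, we have that all $q \in Q$ are
$\frac{\epsilon}{2}$-good with probability $1 - \delta$.

Let $q' = \mathrm{opt}_Q(\mathbf{b}')$, $q'' = \mathrm{opt}_Q(\mathbf{b}'')$, and $q^* = \mathrm{opt}_Q(\mathbf{b})$. By definition, $q'(\mathbf{b}') \ge
q^*(\mathbf{b}')$ and likewise $q''(\mathbf{b}'') \ge q^*(\mathbf{b}'')$. Thus, $q'(\mathbf{b}') + q''(\mathbf{b}'') \ge q^*(\mathbf{b}) = \mathrm{OPT}_Q(\mathbf{b})$.
If all $q$ are $\frac{\epsilon}{2}$-good, certainly $q'$ and $q''$ are; therefore, $q'(\mathbf{b}'') \ge q'(\mathbf{b}') - \frac{\epsilon}{2}\, \mathrm{OPT}_Q(\mathbf{b})$
and $q''(\mathbf{b}') \ge q''(\mathbf{b}'') - \frac{\epsilon}{2}\, \mathrm{OPT}_Q(\mathbf{b})$. Thus, we conclude that our auction profit,
which is $q'(\mathbf{b}'') + q''(\mathbf{b}')$, is at least $(1 - \epsilon)\, \mathrm{OPT}_Q(\mathbf{b})$ with probability $1 - \delta$, which
gives the theorem.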


Notice that this theorem holds for all $\epsilon$ and $\delta$. In particular, it shows how big
the optimal profit must be before we can guarantee a certain approximation factor.
Of course, in the limit as the optimal profit goes to infinity, our approximation
factor approaches one. We refer to the lower bound required of optimal profit,
$\mathrm{OPT}_Q$, in the statement of the theorem as the convergence rate. Indeed, if the
agents’ valuations are between 1 and $h$, the lower bound on the optimal profit can
be translated into a lower bound on the size of the market needed to guarantee the
approximation.

Let us now consider a few applications of the theorem. Suppose that $Q = \{1, \ldots, h\}$.
Then $|Q| = h$ and the convergence rate to a $(1 - \epsilon)$-approximation with probability
$1 - \delta$ is $O(h \epsilon^{-2} \log(2h/\delta))$. If instead $Q$ is powers of 2 on the interval $[1, h]$, then
$|Q| = \log h$ and the convergence rate for constant $\epsilon$ and $\delta$ is $O(h \log \log h)$.
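To get a feel for these rates, the following sketch evaluates the threshold $\frac{8h}{\epsilon^2} \ln\frac{2|Q|}{\delta}$ from Theorem 13.30 for the two price sets just discussed, and checks that at this threshold the union bound over Lemma 13.29 (applied with $\epsilon/2$) equals $\delta$. The function names and the parameter values $h = 100$, $\epsilon = 0.1$, $\delta = 0.05$ are illustrative assumptions.

```python
import math


def required_opt(h: float, eps: float, delta: float, num_prices: int) -> float:
    """Lower bound on OPT_Q(b) required by Theorem 13.30."""
    return (8.0 * h / eps ** 2) * math.log(2.0 * num_prices / delta)


def union_bound_failure(opt_value: float, h: float, eps: float,
                        num_prices: int) -> float:
    """|Q| times the Lemma 13.29 bound on a price not being (eps/2)-good."""
    per_price = 2.0 * math.exp(-((eps / 2.0) ** 2) * opt_value / (2.0 * h))
    return num_prices * per_price


h, eps, delta = 100.0, 0.1, 0.05
integer_prices = list(range(1, 101))         # Q = {1, ..., h}
power_prices = [2 ** k for k in range(7)]    # powers of 2 in [1, h]: 1, 2, ..., 64

for name, prices in (("integers", integer_prices), ("powers of 2", power_prices)):
    threshold = required_opt(h, eps, delta, len(prices))
    failure = union_bound_failure(threshold, h, eps, len(prices))
    print(name, round(threshold), failure)
```

Restricting to powers of 2 lowers the threshold only through the logarithmic factor, at the cost of a coarser set of prices to compete against.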

It is worth noting that the particular bids $\mathbf{b}$ that are input to any particular run of
$\mathrm{RSOP}_Q$ may further restrict the set of possible prices in $Q$ that can be selected, say to
some subset $Q'$. We can apply Theorem 13.30 retrospectively to input $\mathbf{b}$ to bound the
performance of $\mathrm{RSOP}_Q$ in terms of $|Q'|$. For example, in the original RSOP auction
we consider all real numbers as prices; yet, $\mathrm{opt}(\mathbf{b})$ is always one of the bids. Thus,
using $Q' = \{b_1, \ldots, b_n\}$ and noting that $|Q'| = n$ tells us that the convergence rate of
our original RSOP digital good auction is $O(h \epsilon^{-2} \ln(2n/\delta))$. Even better bounds are
possible using a notion called $\gamma$-covers (Exercise 13.6).


Corollary 13.31 For $Q$, $Q'$, $\mathbf{b}$, and $h$ satisfying $q(b_i) \le h$ for all $q$ and $i$,
$\mathrm{opt}_Q(\mathbf{b}') \in Q'$ for all subsets $\mathbf{b}'$ of $\mathbf{b}$, and $\mathrm{OPT}_Q(\mathbf{b}) \ge \frac{8h}{\epsilon^2} \ln\frac{2|Q'|}{\delta}$, with probability
$1 - \delta$ the profit of $\mathrm{RSOP}_Q$ is at least $(1 - \epsilon)\, \mathrm{OPT}_Q(\mathbf{b})$.


Lemma 13.29 and Theorem 13.30 are quite general and can be applied, as written, to 
a wide variety of unlimited supply auction problems with rich structure on the class of 
allowable offers, Q. Two examples are attribute auctions and combinatorial auctions. 


13.4 Prior-Free Optimal Mechanism Design 


In the previous sections, a number of results on approximating the optimal mechanism in
worst-case settings were presented. Unfortunately, these results remain limited in their
applicability. For example, what if $\mathrm{OPT}_Q(\mathbf{b})$ is too small, as might happen if the size
of the market (i.e., the number of bidders) is too small? In such cases, Theorem 13.30
may give us no guarantee. Thus, a natural question to ask is: what is the best truthful 
mechanism? Can we design a truthful mechanism for which we can prove nontrivial 
performance guarantees under any market conditions? 

The first observation that must be made is that there is no such thing as an absolute 
“best” truthful auction. To gain some intuition for this statement, recall that in any 
truthful auction, the offer price $t_i$ to bidder $i$ is a function of all other bids $\mathbf{b}_{-i}$, but
not of $b_i$. Thus, given any particular auction, which is forced to fix the offer price $t_i$
independently of $b_i$ (and hence always performs suboptimally for most values of $b_i$),
there is always some input on which a different truthful auction performs better (see 
Exercise 13.8). 

Given that there is no absolute best truthful mechanism on all inputs, we are left 
with the question of how we can arrive at a rigorous theoretical framework in which 
we can compare auctions and determine one to be better. The key to resolving this 
issue is in moving from absolute optimality to relative optimality. Indeed, whenever 
there is an information theoretic obstacle or computational intractability preventing 
an absolute optimal solution to a problem we can try to approximate. For example, 
in the design of online algorithms the objective is to find an online algorithm that 
performs comparably to an optimal offline algorithm. The notable analogy here is 
between the game theoretic constraint that a mechanism does not know the true bid 
values in advance and must solicit them in a truth-inducing manner, and the online 
constraint that an online algorithm does not have knowledge of the future. 


13.4.1 Competitive Framework 


The general approach will be to try to design an auction with profit that is always (in 
worst case) within a small constant factor of some profit benchmark. 


Definition 13.32 A profit benchmark is a function $\mathcal{G} : \mathbb{R}^n \to \mathbb{R}$ which maps a
vector of valuations to a target profit.


The following definition captures the intuition that an auction is good if it is always
close to a reasonable profit benchmark. 


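Definition 13.33 The competitive ratio of auction $\mathcal{A}$ (defined with respect to an
implicit profit benchmark $\mathcal{G}$) is $\beta = \sup_{\mathbf{b}} \frac{\mathcal{G}(\mathbf{b})}{\mathbf{E}[\mathcal{A}(\mathbf{b})]}$.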


Given a profit benchmark $\mathcal{G}$, the task of an auction designer is to design an auction
that achieves the minimum possible competitive ratio. This auction is the optimal
competitive auction for $\mathcal{G}$.


13.4.2 A Competitive Digital Goods Auction


In this section, we will see that the RSOP auction that was defined in Section 13.3.2 is 
in fact a competitive digital goods auction. To make this statement precise, we first need 
to define the profit benchmark we will be attempting to compete with. In the analysis of 
online algorithms it is not always best to gauge the performance of an online algorithm 
by comparing it to an unconstrained optimal offline algorithm. Similarly, in the analysis 
of truthful auctions, it sometimes makes sense to compare an auction’s profit to a profit 
benchmark that is not necessarily the profit of the optimal algorithm that is given the 
bidders’ true valuations in advance. 

For digital goods auctions, natural profit benchmarks, such as (a) the maximum profit 
achievable with fully discriminating prices (where each bidder pays their valuation) or 
(b) the maximum profit achievable with a single price, are provably too strong in the 
sense that no truthful auction can be constant competitive with these benchmarks. 

Thus, the profit benchmark we will use is the following. 


Definition 13.34 ($\mathcal{F}^{(2)}$) The optimal single priced profit with at least two winners is


$\mathcal{F}^{(2)}(\mathbf{v}) = \max_{i \ge 2} i\, v_{(i)}$,


where $v_{(i)}$ is the $i$th largest valuation.


Theorem 13.24 in Section 13.3.1 can be extended to this setting to show: 


Corollary 13.35 No symmetric deterministic truthful auction has constant competitive
ratio relative to the profit benchmark $\mathcal{F}^{(2)}$.


Thus, we turn to randomized auctions where we find the following theorem. 

Theorem 13.36 RSOP is 15-competitive with $\mathcal{F}^{(2)}$.

We will not prove Theorem 13.36 here as it is primarily a technical probabilistic
analysis. We do note, however, that 15 is likely to be a loose upper bound. On the
other hand, it is easy to see that RSOP cannot have a competitive ratio better than 4, by
considering the bid vector b = ($1, $2). With probability 1/2 both bids end up in the
same part and the RSOP profit is 0. Otherwise, with probability 1/2 one bid is in each
part. Without loss of generality, let b’ = {$1} and b” = {$2}; then p’ = $1 and p” = $2.
Thus, the $1-bid is rejected (because she cannot pay $2) and the $2-bid is offered a
price of $1 which she accepts. The RSOP profit in this case is $1. The expected profit
of RSOP is therefore $0.50 while \(F^{(2)}(b) = \$2\), which shows that RSOP is at best
4-competitive. It is conjectured that this two bid input is in fact the worst case and that
RSOP has a competitive ratio of 4.
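
The two-bid calculation can be checked by enumerating every partition. The sketch below follows the rule used in the worked example (the optimal single price of each part is offered to the other part). The function names and the convention that an empty part sets an infinite price are assumptions of this sketch.

```python
from fractions import Fraction
from itertools import product

def opt_price(bids):
    """Single price maximizing price * (number of bids >= price); inf for an empty set."""
    if not bids:
        return float("inf")  # assumption: an empty sample sets no usable price
    return max(bids, key=lambda p: p * sum(b >= p for b in bids))

def rsop_profit(bids, coins):
    """RSOP profit for one fixed partition; coins[i] == 0 puts bidder i in the first part."""
    part1 = [b for b, c in zip(bids, coins) if c == 0]
    part2 = [b for b, c in zip(bids, coins) if c == 1]
    p1, p2 = opt_price(part1), opt_price(part2)
    # Each part is offered the price computed from the other part.
    return sum(p2 for b in part1 if b >= p2) + sum(p1 for b in part2 if b >= p1)

def rsop_expected_profit(bids):
    """Exact expectation by enumerating all 2^n equally likely partitions."""
    n = len(bids)
    total = sum(Fraction(rsop_profit(bids, coins)) for coins in product((0, 1), repeat=n))
    return total / 2 ** n

print(rsop_expected_profit([1, 2]))  # 1/2, against a benchmark value of 2
```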


13.4.3 Lower Bounds


Now that we have seen that there exists an auction that has constant competitive ratio 
to \(F^{(2)}\), it is interesting to ask: what is the optimal auction in terms of worst case
competitive ratio to \(F^{(2)}\)? What is the competitive ratio of this optimal auction? In this
section, we approach this question from the other side, by looking for lower bounds 
on the competitive ratio. Specifically, we discuss a proof that shows that no auction is 
better than 2.42-competitive. 


Theorem 13.37 No auction has competitive ratio less than 2.42. 


The proof of this theorem involves a fairly complicated analysis of the expected value of 
\(F^{(2)}(b)\) when b is generated from a particular probability distribution. We will instead
prove a simpler result which highlights the main ideas of the theorem. 


Lemma 13.38 No 2-bidder auction has competitive ratio less than 2. 


PROOF The proof follows a simple structure that is useful for proving lower 
bounds for this type of problem. First, we consider bids drawn from a particular 
distribution. Second, we argue that for any auction \(\mathcal{A}\),
\(\mathbf{E}_b[\mathcal{A}(b)] \le \mathbf{E}_b[F^{(2)}(b)]/2\).
This implies that there exists a bid vector \(b^*\) such that \(\mathcal{A}(b^*) \le F^{(2)}(b^*)/2\).

We choose a distribution to make the analysis of \(\mathbf{E}_b[\mathcal{A}(b)]\) simple. This is
important because we have to analyze it for all auctions \(\mathcal{A}\). The idea is to choose the
distribution for b so that all auctions obtain the same expected profit. Consider b
with each \(b_i\) drawn independently so that \(\Pr[b_i > z] = 1/z\) for \(z \ge 1\). Note that
whatever the price \(t_i\) is that \(\mathcal{A}\) offers bidder i, the expected payment made by
bidder i is \(t_i \times \Pr[b_i \ge t_i] = 1\). Thus, for n = 2 bidders the expected profit of
any truthful auction is \(\mathbf{E}_b[\mathcal{A}(b)] = n = 2\).

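We must now calculate \(\mathbf{E}_b[F^{(2)}(b)]\). \(F^{(2)}(b) = \max_{i \ge 2} i\, b_{(i)}\), where \(b_{(i)}\) is
the ith highest bid value. In the case that n = 2, this simplifies to
\(F^{(2)}(b) = 2 b_{(2)} = 2 \min(b_1, b_2)\). We recall that a nonnegative random variable X has
\(\mathbf{E}[X] = \int_0^\infty \Pr[X \ge z]\, dz\) and calculate \(\Pr[F^{(2)}(b) > z]\).

\[
\begin{aligned}
\Pr_b[F^{(2)}(b) > z] &= \Pr_b[b_1 > z/2 \wedge b_2 > z/2] \\
&= \Pr_b[b_1 > z/2] \, \Pr_b[b_2 > z/2] \\
&= 4/z^2 .
\end{aligned}
\]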




Note that this equation is valid only for \(z \ge 2\). Of course for \(z < 2\),
\(\Pr[F^{(2)}(b) \ge z] = 1\). Thus,

\[ \mathbf{E}_b[F^{(2)}(b)] = \int_0^\infty \Pr[F^{(2)}(b) \ge z]\, dz = 2 + \int_2^\infty \frac{4}{z^2}\, dz = 4 . \]


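For this distribution and any auction \(\mathcal{A}\), \(\mathbf{E}_b[\mathcal{A}(b)] = 2\) and \(\mathbf{E}_b[F^{(2)}(b)] = 4\). Thus,
the inequality \(\mathbf{E}_b[\mathcal{A}(b)] \le \mathbf{E}_b[F^{(2)}(b)]/2\) holds and there must exist some input
\(b^*\) such that \(\mathcal{A}(b^*) \le F^{(2)}(b^*)/2\).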
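
The two facts the proof relies on can be sanity-checked by simulation. The sketch below samples bids by inverse transform and estimates (a) the expected revenue of a posted price and (b) the tail of the benchmark at one point. The seed, sample size, and test points are arbitrary choices for this example. The mean of the benchmark is deliberately not estimated directly, because a tail of \(4/z^2\) has infinite variance and sample means converge poorly.

```python
import random

def sample_bid(rng):
    """Inverse-transform sample with Pr[b > z] = 1/z for z >= 1."""
    return 1.0 / (1.0 - rng.random())  # rng.random() is in [0, 1), so no division by zero

rng = random.Random(0)  # fixed seed, chosen arbitrarily
trials = 200_000
pairs = [(sample_bid(rng), sample_bid(rng)) for _ in range(trials)]

# Expected revenue from offering bidder 1 a posted price t is t * Pr[b_1 >= t]; it should be near 1.
for t in (1.5, 4.0, 10.0):
    revenue = sum(t for b1, _ in pairs if b1 >= t) / trials
    print(t, round(revenue, 3))

# With two bidders the benchmark is 2 * min(b_1, b_2); its tail at z should be near 4 / z^2.
z = 4.0
tail = sum(2 * min(b1, b2) > z for b1, b2 in pairs) / trials
print(round(tail, 3), 4 / z ** 2)
```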


For two bidders, this lower bound is tight. Indeed, it is trivial to check that for two 
bidders, the Vickrey auction has competitive ratio 2. 

The lower bound proof given above can be generalized by a more complicated 
analysis to larger n. Such an analysis leads to bounds of 13/6 for n = 3 and eventually 
to a bound of 2.42 for general n. It is conjectured that these bounds are tight. Indeed,
they are tight for \(n \le 3\).


13.4.4 The Digital Goods Auction Decision Problem 


In the next sections, we derive an auction with a competitive ratio of 4. We do this 
by defining the notion of a decision problem for mechanism design and reducing the 
problem of designing a good competitive auction to it. 


Definition 13.39 The digital goods auction decision problem is: given n bidders,
n units of an item, and a target profit R, design a truthful mechanism that obtains
profit R if possible, i.e., if \(R \le F(v)\). Here, \(F(v) = \max_{i \ge 1} i\, v_{(i)}\), where \(v_{(i)}\) is the
ith largest valuation.


This digital goods auction decision problem is also known as the profit extraction 
problem as its goal is to extract a profit R from a set of bidders. It turns out that this 
problem is solved by a special case of a general cost-sharing mechanism. 


Definition 13.40 (\(\mathrm{ProfitExtract}_R\)) The digital goods auction profit extractor
with target profit R sells to the largest group of k bidders that can equally share
R and charges each R/k.


It is straightforward to show that \(\mathrm{ProfitExtract}_R\) is truthful and obtains a profit of R
when \(F(b) \ge R\) (see Exercise 13.10).
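
A minimal sketch of the profit extractor, under the reading that "the largest group of k bidders that can equally share R" means the largest k for which the k highest bids are each at least R/k. The function name and return convention are assumptions of this example.

```python
def profit_extract(bids, target):
    """Return (winner indices, price); (set(), None) if the target cannot be extracted."""
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    # Try the largest group first: the k highest bidders can share the target
    # exactly when the k-th highest bid is at least target / k.
    for k in range(len(bids), 0, -1):
        if bids[order[k - 1]] * k >= target:
            return set(order[:k]), target / k
    return set(), None

print(profit_extract([1, 2, 5], 4))  # ({1, 2}, 2.0): the bids 2 and 5 each pay 2
print(profit_extract([1, 2, 5], 6))  # (set(), None): the best single-price profit is only 5
```

In the first call the extracted profit is exactly the target 4; in the second the target exceeds the best single-price profit, so nobody wins.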


13.4.5 Reduction to the Decision Problem 


A classical optimization problem can typically be phrased as follows: “find a feasible 
solution that maximizes some objective function.” The decision problem version of this 
is: “is there a feasible solution for which the objective function has value at least V?” A 
standard reduction between the two involves solving the decision problem many times, 
using binary search over values V. Unfortunately, such an approach will not work for
mechanism design as it is not truthful to run several truthful mechanisms and then only
take the output of the one that is the most desirable. 
The following truthful auction avoids this problem. 


Definition 13.41 (RSPE) The Random Sampling Profit Extraction auction 
(RSPE) works as follows: 


(i) Randomly partition the bids b into two by flipping a fair coin for each bidder 
and assigning her to b’ or b”. 


(ii) Compute \(R' = F(b')\) and \(R'' = F(b'')\), the optimal profits for each part.
(iii) Run \(\mathrm{ProfitExtract}_{R'}\) on \(b''\) and \(\mathrm{ProfitExtract}_{R''}\) on \(b'\).
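
The three steps can be sketched as follows. The helper names are hypothetical, and only the auctioneer's profit is tracked rather than the individual winners. In the degenerate case where both parts have the same optimal profit, both extractions succeed in this sketch and the profit is twice the minimum.

```python
import random

def opt_profit(bids):
    """F(b): max over k >= 1 of k * (k-th highest bid); 0 for an empty list."""
    ranked = sorted(bids, reverse=True)
    return max((k * b for k, b in enumerate(ranked, start=1)), default=0)

def extracted_profit(bids, target):
    """Profit of the profit extractor on bids: target if some k highest bidders can each pay target/k."""
    ranked = sorted(bids, reverse=True)
    feasible = any(k * b >= target for k, b in enumerate(ranked, start=1))
    return target if target > 0 and feasible else 0

def rspe(bids, rng):
    """One run of RSPE; returns (profit, R', R'') for this random partition."""
    part1, part2 = [], []
    for b in bids:
        (part1 if rng.random() < 0.5 else part2).append(b)   # step (i)
    r1, r2 = opt_profit(part1), opt_profit(part2)             # step (ii)
    # Step (iii): each part faces the target computed from the other part.
    profit = extracted_profit(part2, r1) + extracted_profit(part1, r2)
    return profit, r1, r2

rng = random.Random(7)  # arbitrary seed
for _ in range(5):
    profit, r1, r2 = rspe([1, 1, 2, 3, 5, 8, 13], rng)
    print(profit, min(r1, r2))
```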


The intuition for this auction is that \(\mathrm{ProfitExtract}_R\) allows us to treat a set of bidders,
b, as one bidder with bid value F(b). Recall that a truthful auction must just offer a
price \(t_i\) to bidder i who accepts if her value is at least \(t_i\). This is analogous to trying to
extract a profit R from bidders b and actually getting R in profit when \(F(b) \ge R\). The
RSPE auction can then be viewed as randomly partitioning the bidders into two parts,
treating one partition of the bids \(b'\) as a single bid with value \(R' = F(b')\), the other
partition \(b''\) as a single bid with value \(R'' = F(b'')\), and then running the Vickrey
auction on these two “bids.” This intuition is crucial for the proof that follows as it
implies that the profit of RSPE is the minimum of \(R'\) and \(R''\).


Theorem 13.42 The competitive ratio of RSPE is 4. 


PROOF As we discussed above, the profit of RSPE is min(R’, R”). Thus, we 
just need to analyze E[min(R’, R”)]. 

Assume that \(F^{(2)}(b) = kp\), with \(k \ge 2\) winners at price p. Of the k winners
in \(F^{(2)}\), let \(k'\) be the number of them that are in \(b'\) and \(k''\) the number that are
in \(b''\). Since there are \(k'\) bidders in \(b'\) at price p, \(R' \ge k'p\). Likewise, \(R'' \ge k''p\).
Thus,

\[ \frac{\mathbf{E}[\mathrm{RSPE}(b)]}{F^{(2)}(b)} = \frac{\mathbf{E}[\min(R', R'')]}{kp} \ge \frac{\mathbf{E}[\min(k'p, k''p)]}{kp} = \frac{\mathbf{E}[\min(k', k'')]}{k} \ge \frac{1}{4} . \]

The last inequality follows from the fact that if \(k \ge 2\) fair coins (corresponding
to placing the winning bidders into either \(b'\) or \(b''\)) are flipped then
\(\mathbf{E}[\min\{\#\text{heads}, \#\text{tails}\}] \ge k/4\).

It is evident that RSPE is no better than 4-competitive via an identical proof to
that of the analogous result for RSOP.
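
The coin-flipping fact used in the last step of the proof can be verified exactly for small k. This is only a numerical check of the stated inequality, not a proof; the function name is an assumption of this example.

```python
from fractions import Fraction
from math import comb

def expected_min_heads_tails(k):
    """Exact E[min(#heads, #tails)] for k independent fair coin flips."""
    return sum(Fraction(comb(k, h) * min(h, k - h), 2 ** k) for h in range(k + 1))

for k in range(2, 9):
    ratio = expected_min_heads_tails(k) / k
    print(k, ratio)  # each ratio is at least 1/4, with equality at k = 2 and k = 3
```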


The currently best known competitive auction, which has a competitive ratio of 3.25, 
is based on generalizing the idea of RSPE: First, the bids are randomly partitioned into 
three parts, instead of two, with each part being treated as a single bid with value equal 
to its optimal single price revenue. Then the optimal 3-bidder auction is run on these 
three “bids.” 

The random partitioning and profit extraction approach is fairly general. For it to 
work successfully, it needs to be shown that a profit extractor for the benchmark exists,
and that up to constant factors, the benchmark is preserved on a random sample of the
agents. Notice that the consistency issue discussed in earlier sections is not relevant if 
only the agents in one partition win. This approach has been applied successfully to 
several other settings. 


13.4.6 Consensus Estimation and Truthfulness with High Probability 


We now look at an alternative reduction to the decision problem and an approach to 
competitive auctions that does not use random sampling. This approach leads to a 
truthful digital goods auction that is 3.39-competitive with \(F^{(2)}\). However, rather than
presenting that result, we present a more general version of the approach with wider 
applicability. To achieve this greater level of generality, we will need to relax our 
solution concept and talk about truthfulness with high probability. 


Definition 13.43 A randomized mechanism is truthful with high probability,
say \(1 - \epsilon\), if and only if for all i, \(v_i\), \(b_i\), and \(b_{-i}\), the probability that agent i
benefits by bidding nontruthfully is at most \(\epsilon\), where the probability is taken
over the coin flips of the mechanism. In other words, for all i, \(v_i\), \(b_i\), and \(b_{-i}\),
\(\Pr[u_i(v_i, b_{-i}) \ge u_i(b_i, b_{-i})] \ge 1 - \epsilon\).


The techniques presented in this section, when applied to the digital goods auction, 
result in a mechanism that is truthful with probability 1 — O(1/m) where m is the 
number of winners in \(F^{(2)}\). Thus, as the input instance grows and there are more winners,
the probability that nontruthful reporting by the agents is beneficial approaches zero. 

Let us first describe the general idea. Consider attempting to design an auction to 
compete with profit benchmark G. Suppose that there exists a profit extractor for G, 
\(\mathrm{ProfitExtract}_{G,R}\), which obtains profit R from b if \(R \le G(b)\). Then the mechanism we
would like to run is the following: 


(i) Compute R = G(b). 
(ii) Run \(\mathrm{ProfitExtract}_{G,R}\) on b.


This fails of course because, generally, the R computed in Step (i) is a function of an 
agent’s bid and therefore the agent could misreport their bid to obtain an R value that 
results in a more favorable outcome for them in Step (ii). 

On the other hand, it is often the case that a single agent only contributes a small 
fraction to the profit G(b). In particular, suppose that there is some \(\rho\) such that for all
i, \(G(b_{-i}) \in [G(b)/\rho, G(b)]\). In this case \(G(b_{-i})\) is a pretty good estimate of G(b). The
idea then is to replace Step (i) above with 


(i)’ Compute R = r(G(b)). 


where the probabilistic function \(r(\cdot)\) is a \(\rho\)-consensus \(\beta\)-estimate:


Definition 13.44 A (randomized) function \(r(\cdot)\) is a \(\rho\)-consensus if for all \(V > 0\),
with high probability all \(V' \in [V/\rho, V]\) satisfy \(r(V') = r(V)\).


Intuitively, if \(r(\cdot)\) is a \(\rho\)-consensus then with high probability \(r(G(b)) = r(G(b_{-i}))\) for
all i. This will imply that bidder i has very small probability of being able to influence
the value of r(G(b)) and thus we will be able to guarantee truthfulness with high
probability.


Definition 13.45 A (randomized) function \(r(\cdot)\) is a \(\beta\)-estimate if for all \(V > 0\)
it satisfies \(r(V) \le V\) and \(\mathbf{E}[r(V)] \ge V/\beta\).


Intuitively, if \(r(\cdot)\) is a \(\beta\)-estimate, then r(G(b)) is close to, but no larger than, G(b). If this is
the case, then running \(\mathrm{ProfitExtract}_{G,R}\) on b, with R = r(G(b)), will extract a revenue
R which is close to G(b).

Of course, even in Step (i)’, R is a function of all the bids, so the resulting auction 
is not truthful. However, under some mild assumptions² it is possible to show that in
the case that r(G(b)) is a consensus no bidder has an incentive to deviate and misreport 
their valuation. The resulting mechanism is truthful with high probability. 

We now show how to construct the function r(-). 


Definition 13.46 (\(r_\alpha\)) Given \(\alpha > 1\), the randomized function \(r_\alpha(\cdot)\) picks U
uniformly from [0, 1] and is

\(r_\alpha(V)\) = “V rounded down to the nearest \(\alpha^{i+U}\) for integer i.”
Straightforward probabilistic analysis can be used to prove the following lemmas. 


Lemma 13.47 \(r_\alpha\) is a \(\rho\)-consensus with probability \(1 - \log_\alpha \rho\).


Lemma 13.48 \(r_\alpha\) is a \(\beta\)-estimate with \(\beta = \frac{\ln \alpha}{1 - 1/\alpha}\).
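
The rounding function and the two quantities in the lemmas can be explored numerically. In the sketch below, the function name, the seed, and the parameter values are arbitrary choices. The closed form used for comparison in the last line is obtained here by integrating \(\alpha^{-y}\) over y in [0, 1], which is an assumption of this sketch rather than a quotation of the lemma.

```python
import math
import random

def consensus_estimate(value, alpha, u):
    """value rounded down to the nearest alpha**(i + u) over integers i (value > 0, alpha > 1)."""
    i = math.floor(math.log(value, alpha) - u)
    est = alpha ** (i + u)
    return est if est <= value else alpha ** (i - 1 + u)  # guard against floating-point rounding

rng = random.Random(1)  # arbitrary fixed seed
alpha, rho, value, trials = 2.0, 1.25, 1000.0, 100_000
agree = 0
ratio_sum = 0.0
for _ in range(trials):
    u = rng.random()
    high = consensus_estimate(value, alpha, u)
    low = consensus_estimate(value / rho, alpha, u)
    agree += math.isclose(low, high)  # r is monotone in V, so the two endpoints decide consensus
    ratio_sum += high / value

print(agree / trials, 1 - math.log(rho, alpha))                 # empirical consensus rate vs. 1 - log_alpha(rho)
print(ratio_sum / trials, (1 - 1 / alpha) / math.log(alpha))    # empirical E[r(V)] / V vs. the integral
```

Because the same draw U is used for every input value, checking the two endpoints of the interval is enough to detect whether all values in between round to the same estimate.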

In the most general setting of single parameter agents, given the existence of a profit
extractor for a benchmark G, these lemmas can be combined with the consensus estimate
profit extraction auction (CEPE) described above, to give the following theorem (see
Exercise 13.11).


Theorem 13.49 Given a monotone profit benchmark G for a single-parameter
agent problem specified by cost function \(c(\cdot)\) and a monotone profit extractor
\(\mathrm{ProfitExtract}_{G,R}\), CEPE is \(\beta\)-competitive (with \(\beta\) as in Lemma 13.48) and truthful
with probability \(1 - \log_\alpha \rho\) on inputs b satisfying \(G(b_{-i}) \in [G(b)/\rho, G(b)]\).
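
To make the pieces concrete, here is a sketch of CEPE for digital goods under the assumption that the benchmark G is the single-price benchmark F and that the profit extractor is the one from Section 13.4.4. All function names are hypothetical, and only the auctioneer's profit is computed. The helper `worst_rho` reports the smallest \(\rho\) for which the input satisfies the condition of the theorem.

```python
import math
import random

def opt_profit(bids):
    """F(b): max over k >= 1 of k * (k-th highest bid); 0 for an empty list."""
    ranked = sorted(bids, reverse=True)
    return max((k * b for k, b in enumerate(ranked, start=1)), default=0.0)

def cepe_profit(bids, alpha, rng):
    """One run of CEPE with G = F: estimate the benchmark, then extract that target."""
    g = opt_profit(bids)
    u = rng.random()
    i = math.floor(math.log(g, alpha) - u)
    target = min(alpha ** (i + u), g)  # step (i)': R = r_alpha(G(b)), capped against rounding error
    ranked = sorted(bids, reverse=True)
    # Step (ii): extraction succeeds iff some k highest bidders can each pay target / k.
    feasible = any(k * b >= target for k, b in enumerate(ranked, start=1))
    return target if feasible else 0.0

def worst_rho(bids):
    """Smallest rho with F(b_{-i}) >= F(b) / rho for every bidder i."""
    g = opt_profit(bids)
    return max(g / opt_profit(bids[:i] + bids[i + 1:]) for i in range(len(bids)))

bids = [1.0] * 20
rho = worst_rho(bids)
print(rho, 1 - math.log(rho, 2.0))               # rho = 20/19 and the resulting probability bound for alpha = 2
print(cepe_profit(bids, 2.0, random.Random(0)))  # lies between F(b) / alpha and F(b)
```

With m equal bids, removing one bidder lowers the benchmark by a factor of m/(m - 1), which is consistent with the remark that the failure probability shrinks as the number of winners grows.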


2 What we need here is that the price offered to bidder i by \(\mathrm{ProfitExtract}_{G,R}\) is monotone in R, that G(b) is
monotone in b, and that r(V) is monotone in V. See Exercise 13.11.


13.5 Frugality


We now turn to a radically different class of problems, in which the auctioneer is a buyer
intent on hiring a team of agents to perform a complex task. In this model, each agent i
can perform a simple task at some cost \(-v_i\) known only to himself. Based on the agents’
bids \(b_i\), the auctioneer must select a feasible set, that is, a set of agents whose combined
skills are sufficient to perform the complex task (\(x_i = 1\) if agent i is selected),
and pay each selected agent some amount \(-p_i\) (this is negative because we previously
defined \(p_i\) as a transfer from the agent to the auctioneer). The setting is thus defined
by the set system of feasible sets \((E, \mathcal{S})\), where E represents the set of agents and \(\mathcal{S}\)
represents the collection of feasible subsets of E. In terms of our single parameter
framework, we have \(c(x) = 0\) if \(\{i \mid x_i = 1\} \in \mathcal{S}\), and \(\infty\) otherwise. Several special
cases have received a great deal of attention.


Example 13.50 (path auctions) Here the agents own edges of a known directed 
graph (i.e., E is the set of edges) and the auctioneer wishes to purchase a path 
between two given nodes s and t (i.e., \(\mathcal{S}\) is the set of all s-t paths).


Example 13.51 (spanning tree auctions) Here the agents own edges of a 
known connected, undirected graph, so again E is the set of edges, and the 
auctioneer wishes to purchase a spanning tree. 


Whereas when the auctioneer was a seller, our goal was to design a mechanism 
to maximize his profit, here our goal is to design a mechanism to minimize the 
payments the auctioneer makes, i.e., to hire the team of agents as cheaply as possible.
Hence, analyzing the frugality of a mechanism (the amount by which it
overpays) becomes an important aspect of mechanism design, analogous to profit
maximization. We study frugality here using worst-case competitive analysis, as in 
Section 13.4. 

A first observation is that here, unlike the digital goods auctions we focused on in the 
previous sections, the auctioneer is interested only in a single “object,” a feasible set. 
Thus, at a very high level, these problems are closest in spirit to the single item auction 
that we discussed in the context of profit maximization. For single-item auctions, in the 
absence of any prior information about agents’ valuations, it is possible to show that
the Vickrey auction is optimal, and, of course, achieves a profit equal to the value of the 
second highest bidder. Thus, a natural first mechanism to consider for hiring-a-team 
auctions is the VCG mechanism. 

Consider a path auction where the graph consists of n parallel edges from s to t.
This corresponds exactly to the case where the auctioneer is buying a single item, 
and the Vickrey mechanism will result in a payment equal to the cost of the second 
cheapest edge. Compare this to what happens in a graph consisting of two vertex 
disjoint s-t paths P and P’, each with n edges. Suppose that each edge on path 
P has cost zero, and each edge on path P’ has cost one, so that the total cost of 
path P is zero and of path P’ is n. Then the VCG mechanism will purchase path 
P, and each edge on that path will be paid n, for a total auctioneer payment of
\(n^2\). Thus, here the VCG mechanism pays much more than the cost of the second
cheapest path. Can we do better? How, in general, does the optimal truthful mecha- 
nism (in terms of competitive ratio) depend on the combinatorial structure of the set 
system? 
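
The two-path example can be reproduced with a few lines of code. This sketch handles only the special case of two vertex-disjoint paths, where the VCG payment to a winning edge is its own cost plus the gap between the two path totals. The function name and the label "Q" for the second path are assumptions of this example.

```python
def vcg_two_paths(costs_p, costs_q):
    """VCG on two vertex-disjoint s-t paths given per-edge costs; returns (winner, payments)."""
    total_p, total_q = sum(costs_p), sum(costs_q)
    if total_p <= total_q:
        winner, win_costs, win_total, lose_total = "P", costs_p, total_p, total_q
    else:
        winner, win_costs, win_total, lose_total = "Q", costs_q, total_q, total_p
    # A winning edge is paid the largest cost it could report and still win:
    # its own cost plus the gap between the two path totals.
    payments = [c + (lose_total - win_total) for c in win_costs]
    return winner, payments

n = 5
winner, payments = vcg_two_paths([0] * n, [1] * n)
print(winner, payments, sum(payments))  # P [5, 5, 5, 5, 5] 25
```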
13.5.1 Competitive Framework


As with our worst-case bounds from the previous section, the first issue that must 
be addressed to study frugality is the competitive framework and in particular the 
benchmark for comparison, which in this case is a cost benchmark. 

We would like the frugality ratio to capture the overpayment of a mechanism with 
respect to a “natural” lower bound. One natural choice for this lower bound is the 
minimum payment by a nontruthful mechanism, in which case, the frugality ratio 
would characterize the cost of insisting on truthfulness. 

Consider the mechanism \(\mathcal{N}\) which, given the bids b, selects the cheapest feasible
set with respect to these bids, and pays each winning agent his bid (ties are broken in
favor of the efficient allocation). This mechanism is a pay-your-bid auction and is not
truthful. However, it does have at least one (full information) pure Nash equilibrium,
i.e., a bid vector b such that, for each agent i, given the bids \(b_{-i}\) by all other agents,
i maximizes his profit by bidding \(b_i\). A Nash equilibrium can be considered a natural
outcome of the mechanism \(\mathcal{N}\), and the resulting net payments are thus a good cost
benchmark. As we are interested in a lower bound, we define the cheapest Nash value
\(\mathcal{N}(v)\) to be the minimum payments by \(\mathcal{N}\) over all of its Nash equilibria.

To illustrate this definition, consider the case of an s-t path auction in which there 
are k parallel paths, as in our k = 2 path example above. Then, \(\mathcal{N}(v)\) is precisely the
cost of the second-cheapest path — the agents on the cheapest path will raise their bids 
until the sum of their bids equals the cost of the second-cheapest path, at which point 
they can no longer raise their bids. None of the other edges have incentive to raise 
their bids (as they are losing either way), nor to lower their bids, as they would incur a 
negative profit. Thus, the metric in this case makes perfect sense — it is the cost of the 
second cheapest solution disjoint from the actual cheapest. 

With a cost benchmark in hand, we can now formalize a competitive framework for 
these problems. 


Definition 13.52 The frugality ratio of truthful mechanism \(\mathcal{M}\) for buying a
feasible set in set system \((E, \mathcal{S})\) is

\[ \sup_{v} \frac{\mathcal{M}(v)}{\mathcal{N}(v)}, \]

where \(\mathcal{M}(v)\) denotes the total payments of \(\mathcal{M}\) when the actual private values are
v, and \(\mathcal{N}(v)\) is the cost benchmark, the cheapest Nash value with respect to the
true values v.
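
For vertex-disjoint parallel paths, both the benchmark and the VCG payments have simple closed forms, so the ratio of payments to the cheapest Nash value can be computed for a given cost vector. The sketch below evaluates that ratio at one input; the frugality ratio itself is the worst case over all inputs. Function names are assumptions of this example.

```python
def cheapest_nash_value(path_costs):
    """Benchmark for vertex-disjoint parallel s-t paths: the cost of the second-cheapest path."""
    return sorted(sum(p) for p in path_costs)[1]

def vcg_total_payment(path_costs):
    """Total VCG payment: each edge of the cheapest path gets its cost plus the gap to the runner-up."""
    totals = sorted((sum(p), len(p)) for p in path_costs)
    (best, edges), (second, _) = totals[0], totals[1]
    return best + edges * (second - best)

n = 6
paths = [[0] * n, [1] * n]  # the two-path example from the start of Section 13.5
print(vcg_total_payment(paths) / cheapest_nash_value(paths))  # 6.0, i.e. n
```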


13.5.1.1 Bounds on the Frugality Ratio 


The example we saw earlier shows that the VCG mechanism does not, in general, have
small frugality ratio. There is, however, one class of set systems for which VCG is
known to have optimal frugality ratio equal to 1; it is given in the following theorem
(see Exercise 13.12).


3 Here we consider only Nash equilibria where nonwinners bid their true value, and ties are broken according to
efficiency. We refer the reader to the relevant references for a justification of this restriction.


Theorem 13.53 VCG has frugality ratio one if and only if the feasible sets of
the set system are the bases of a matroid. 


On the other hand, for path auctions, say when there are two parallel paths, each 
consisting of many agents, VCG can have frugality ratio \(\Omega(n)\). The following lower
bound shows that this bad case is not unique to the VCG mechanism. 


Theorem 13.54 Consider the path auction problem on a graph G consisting
of two vertex disjoint s-t paths, \(P\) and \(P'\), where \(|P| = n\) (\(|P|\) is the number of
edges on the path P), and \(|P'| = n'\). Then any truthful mechanism for buying a
path in this graph has frugality ratio at least \(\Omega(\sqrt{nn'})\).


PROOF Define \(v^{(P,i)}\) to be the vector of private values for agents in \(P\), in which
edge i on \(P\) has cost \(1/\sqrt{n}\) (so its value is \(v_i = -1/\sqrt{n}\)), and all the rest of the
edges in \(P\) have cost zero. Similarly, let \(v^{(P',j)}\) be the vector of private values for
agents in \(P'\) in which edge j on \(P'\) has cost \(1/\sqrt{n'}\) and all the rest of the edges
have cost zero. Let \(\mathcal{M}\) be an arbitrary deterministic truthful path auction applied
to this graph. Define a bipartite graph \(G'\) with a node for each edge in G and
directed edges defined as follows: there is an edge from node i (corresponding to
edge i in \(P\)) to node j (corresponding to edge j in \(P'\)) (respectively an edge from
j to i), if when running \(\mathcal{M}\) on bid vector \((v^{(P,i)}, v^{(P',j)})\) path \(P'\) wins (resp. \(P\)
wins).

Since there are \(nn'\) directed edges in this graph, there must be either a node
i in \(P\) with at least \(n'/2\) outgoing edges or a node j in \(P'\) with at least \(n/2\)
outgoing edges. In the former case, observe that, by the monotonicity of any
truthful mechanism, \(P'\) must still win even if all edges in \(P'\) bid 0, and the
payments to each of the relevant edges equal their threshold bid which is at least
\(1/\sqrt{n'}\). Thus the total payments are at least \(\sqrt{n'}/2\). Since in this case the cheapest
Nash value is \(1/\sqrt{n}\), the ratio is at least \(\sqrt{nn'}/2\), which is the desired lower bound.
The analysis for the second case proceeds mutatis mutandis.


The previous lower bound can be generalized to randomized mechanisms. An immediate
corollary of this lower bound is that any truthful mechanism has frugality ratio
\(\Omega(n)\) on a graph consisting of two vertex disjoint paths of length n. Thus, for this graph,
VCG achieves the optimal frugality ratio up to a constant factor.

On the other hand, if n’ = 1, the above lower bound on the frugality ratio of any 
mechanism is \(\Omega(\sqrt{n})\). However, for the case of two parallel paths, one of length 1 and one
of length n, VCG has a frugality ratio of n — the worst case is when the long path wins. 
This raises the question of whether or not there is a better truthful mechanism for this 
graph. 

The answer to this question is “yes.” The principle is fairly simple: if a large set
is chosen as the winner, each of its elements will have to be paid a certain amount
(depending on the other agents’ bids). Hence to avoid overpayment, a mechanism
should, within reason, give preference to smaller sets. Thus, rather than choosing the
cheapest feasible set (i.e., the social welfare maximizing allocation), one could consider 
weighting the cost of feasible sets by weights that capture the relative sizes of those 
sets compared to other sets. To obtain a near-optimal mechanism for path auctions, 
the precise magnitude of these weights should be chosen to balance the worst-case 
frugality ratio over all potential combinations of winning sets. 

To illustrate this, let us return to the graph consisting of two vertex disjoint paths.
We can balance the worst-case frugality ratio by choosing the path that minimizes
\(\sqrt{|P|}\, c(P)\), where c(P) is the cost of the path P, i.e., \(c(P) = -\sum_{e \in P} v_e\). Notice that
this mechanism uses a monotone allocation rule and hence is truthful. In this case, if
the two paths are \(P\) and \(P'\), and, say, \(P\) is chosen, the payments to each edge on \(P\)
will be upper bounded by \(\sqrt{|P'|}\, c(P') / \sqrt{|P|}\). This is because the threshold bid, and hence the
payment, to an edge e on \(P\) is the largest value they could bid and still win. Thus, the
total payments are at most

\[ |P| \cdot \frac{\sqrt{|P'|}\, c(P')}{\sqrt{|P|}} = \sqrt{|P|\,|P'|}\; c(P') . \]

Since \(c(P')\) is a lower bound on the cheapest Nash value of \(\mathcal{N}\), the ratio of payments to
cheapest Nash is upper bounded by \(\sqrt{|P|\,|P'|}\). The same bound holds when \(P'\) is
the selected path, resulting in a frugality ratio matching the lower bound to within a
constant factor.
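
The weighted selection rule and its threshold payments can be sketched for the two-path case. The threshold for an edge is obtained by solving the selection inequality for that edge's bid, holding all other bids fixed. The function name and the label "Q" for the second path are assumptions of this example.

```python
import math

def sqrt_weighted_path_auction(costs_p, costs_q):
    """Pick the path minimizing sqrt(|path|) * c(path); pay each winning edge its threshold bid."""
    score_p = math.sqrt(len(costs_p)) * sum(costs_p)
    score_q = math.sqrt(len(costs_q)) * sum(costs_q)
    if score_p <= score_q:
        winner, win, lose_score = "P", costs_p, score_q
    else:
        winner, win, lose_score = "Q", costs_q, score_p
    # An edge with cost c still wins while sqrt(|win|) * (c(win) - c + bid) <= lose_score.
    cap = lose_score / math.sqrt(len(win))
    payments = [cap - (sum(win) - c) for c in win]
    return winner, payments

# A long free path of 16 edges against a single edge of cost 1.
winner, payments = sqrt_weighted_path_auction([0] * 16, [1])
print(winner, sum(payments))  # P 4.0
```

On this input the total payment is 4, which equals the square root of the product of the path lengths times the losing path's cost, whereas paying each of the 16 edges the full gap of 1 would cost 16.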

These ideas can be generalized to get a mechanism whose frugality ratio is within a 
constant factor of optimal, for any path auction problem, as well as some other classes 
of “hiring-a-team” problems. For most set systems, however, the design of a truthful 
mechanism with optimal or near-optimal frugality ratio is open. 


13.6 Conclusions and Other Research Directions 


In this chapter, we have surveyed the primary techniques currently available for design- 
ing profit-maximizing (or cost-minimizing) auctions in single-parameter settings. Even 
in the single-parameter setting, finding mechanisms with optimal competitive ratio (for 
selling problems) or optimal frugality ratio (for buying problems) is challenging and 
largely open. The situation is much worse once we get to multiparameter problems 
such as various types of combinatorial auctions. In these settings, numerous new chal- 
lenges arise. For example, we do not have a nice, clean, simple characterization of 
truthfulness. Another issue is that it is completely unclear what profit benchmarks are 
appropriate. 

In the rest of this section, we briefly survey a number of other interesting research 
directions. 


Profit Benchmarks. In our discussions of competitive mechanisms, we saw that the 
profit benchmark of a mechanism was a crucial component of the competitive approach 
to optimal mechanism design. This raises a fundamental issue (that has yet to be 
adequately resolved even in simple settings): what makes a profit benchmark the
“right” one?


Pricing. In this chapter, we have discussed private-value mechanism design for profit
maximization. However, even the public value versions of some of these problems, 
which are essentially algorithmic pricing problems, are open. 

Consider, for example, the problem of pricing links in a network. We are given a 
graph, and a set of consumer valuations. Each valuation is given as a triple \((s_i, t_i, v_i)\),
indicating that consumer i wishes to traverse a path from \(s_i\) to \(t_i\) and his value for
traversing this path (i.e., the maximum price he is willing to pay) is \(v_i\). With no
restriction on pricing, the profit-maximizing solution to the public value problem is 
trivial: charge each consumer his value. However, such a pricing scheme is unreasonable 
for many reasons, the foremost of which is that this pricing scheme is highly unfair — 
different customers can get exactly the same product at different prices. An alternative 
pricing question is the following: define a set of prices for the edges in the graph 
(think of them as tolls) so as to maximize the total revenue collected. The model is 
that, for each consumer i, the network will collect the cost of the cheapest path from
\(s_i\) to \(t_i\) with respect to the edge prices set, if that cost is at most \(v_i\). This is just one
example of an interesting algorithmic pricing problem that has recently received some 
attention. The vast majority of interesting combinatorial pricing problems are not well 
understood. 


Derandomization. As we have seen, randomization is a very important tool in 
the design of competitive auctions. For example, randomization was used in digi- 
tal goods auctions to skirt around impossibility results for deterministic symmetric 
auctions. Recently, however, deterministic constant competitive asymmetric digital 
goods auctions have been discovered. It is an interesting direction for future research 
to understand the general conditions under which one can derandomize competitive 
auctions, or design deterministic auctions from scratch. Unfortunately, standard al- 
gorithmic derandomization techniques do not work in truthful mechanism design 
because running the mechanism with the many possible outcomes of a randomized 
decision making procedure is no longer truthful. Thus, significant new ideas are 
required. 


Fairness. We have focused our attention here on a single goal: profit maximization. 
In some situations, we desire that the mechanisms we design have other properties. 
For example, the randomized digital goods auctions that we have seen are not terribly 
fair — when we run, say, RSOP, some bidders pay a higher price than other bidders, 
and some bidders lose even though their value is higher than the price paid by other 
winning bidders. We say that outcomes of this type are not envy-free. (An auction is 
envy-free if after the auction is run, no bidder would be happier with someone else’s 
outcome.) 

It turns out that it is not possible to design a truthful, constant-competitive digital 
goods auction that is envy-free. Thus, alternative approaches have been explored for 
getting around this impossibility, including relaxing the solution concept to truthfulness 
with high probability, or allowing the mechanism to have a very small probability of 
producing a non-envy-free outcome. 

More generally, designing auctions that both achieve high profit and are, in some 
sense, fair is a wide open direction for future research.


Collusion. All of the results presented in this chapter assume no collusion between the
agents and indeed do not work properly in the presence of collusion. What can be done 
in the presence of collusion? For example, for digital goods auctions, it has been shown 
that it is not possible to design a truthful mechanism that is both profit-maximizing and 
collusion-resistant. However, using the approach of consensus estimates, it is possible 
to get around this impossibility with a mechanism that is truthful with high probability. 


Bounded communication. How do we design revenue maximizing mechanisms when 
the amount of communication between the agents and the auctioneer is severely re- 
stricted? Bounded communication is particularly relevant in settings such as allocation 
of low-level resources in computer systems, where the overhead of implementing an 
auction will by necessity be severely restricted. Most of the work on this topic so far 
has focused on the trade-off between communication and efficiency. These results, of 
course, have implications for revenue maximization in a Bayesian setting due to the 
reduction from revenue maximization to surplus maximization via virtual valuations. 


Bundling. Another interesting direction is bundling. It has been proved that in several 
settings, bundling items together may increase the revenue of the mechanism. However, 
the limits of this approach are not understood. 


Repeated and online Games. Profit maximization (or cost minimization) in mecha- 
nism design arises in many settings, including resource allocation, routing and conges- 
tion control, and electronic commerce. In virtually every important practical application 
of mechanism design, the participants are dynamic. They arrive and depart over time, 
with decisions being made on an ongoing basis. Moreover, in many important appli- 
cations, the same “game” is played over and over again. Our understanding of online, 
repeated games from the perspective of profit maximization is limited. For example, 
sponsored search auctions, discussed in Chapter 28, lead to many interesting open 
questions of this type. 


Alternative solution concepts. Although truthfulness is not a goal in and of itself 
when the goal is profit maximization, it is a strong and appealing concept: First, truthful 
mechanisms obviate the need for agents to perform complex strategic calculations or 
gather data about their competitors. Second, in some cases, especially single-parameter 
problems, they simplify the design and analysis of protocols. Third, there is no loss of 
generality in restricting ourselves to truthful mechanisms if our plan is to implement a 
mechanism with dominant strategies (by the revelation principle). Fourth, in a number 
of settings, the revenue extracted by the natural truthful mechanism is the same as that 
extracted by natural nontruthful mechanisms (by the revenue equivalence theorem). A 
related point is that there are often natural and appealing variants of truthful mechanisms 
that achieve the same outcome (e.g., an English auction instead of a second-price 
auction). Finally, and this is important, if we do not understand the incentive structure 
of a problem in a truthful setting, we are going to be very hard-pressed to understand 
it in any other setting. 

Having said all that, truthful mechanism design also has a number of significant 
drawbacks. For one thing, people often do not feel that it is safe to reveal their
information to an auctioneer. An interesting alternative is to use an ascending auction,
where published prices can only rise over time, or an iterative auction, where the
auction protocol repeatedly queries the different bidders, aiming to adaptively elicit 
enough information about the bidders’ preferences to be able to find an optimal or 
near-optimal outcome. What is the power of ascending and iterative auctions when the 
auctioneer’s goal is profit maximization? 

Truthfulness may also needlessly limit our ability to achieve our goals. This is mani- 
fested in terms of extreme limitations on the mechanism, exceedingly high competitive 
ratios, or simply impossibility. In the repeated game setting, these issues are much more 
severe. Thus, one of the most important directions for future research is to consider 
alternative solution concepts. 

It has been shown that taking a small step away from truthfulness, e.g., to truth- 
fulness with high probability, can enable us to overcome some impossibility results. 
Other solution concepts that have received consideration in the literature include Nash 
equilibria, correlated equilibria, and extensions of these. However, very little work 
has been done concerning the design of profit-maximizing mechanisms using these 
solution concepts. 

In summary, major directions for future research are to figure out the correct solution 
concepts for use in profit-maximizing auction design, and to develop techniques for 
designing profit-maximizing mechanisms with respect to these concepts, especially 
in online and repeated settings. The key desiderata of an equilibrium or solution 
concept are that (a) there exist mechanisms that in this equilibrium achieve or at least 
approximate our profit maximization goals (and whatever other goals we may have) 
and (b) there are simple, rational, i.e., utility-maximizing, strategies for the players that 
lead to outcomes in this equilibrium.⁴


13.7 Notes 


Profit maximization in mechanism design has an extensive history beginning, pri- 
marily, with the seminal paper of Myerson (1981) and similar results by Riley and 
Samuelson (1981). These papers study Bayesian optimal mechanism design in the less 
restrictive setting of Bayes-Nash equilibrium. However, Myerson’s optimal mechanism 
is precisely the optimal truthful mechanism we present here. This material is by now 
standard and can be found in basic texts on auction theory (Krishna, 2002; Klemperer, 
1999). 

The material on approximately optimal mechanism design, including the empir- 
ical Myerson mechanism and the random sampling optimal price auction comes 
from Baliga and Vohra (2003), Segal (2003), and Goldberg et al. (2006). Precise anal- 
ysis of convergence rates for unlimited supply auction settings is given in Balcan et al. 
(2005). 

The worst-case competitive approach to profit maximization, the proof that no sym- 
metric, deterministic auction is competitive and the RSOP auction were first introduced 
in Goldberg et al. (1999), Goldberg et al. (2001), and Goldberg et al. (2006). The proof 


4 Alternatively, we can ask that there are simple and reasonable behaviors that the players can follow that lead to 
outcomes in equilibrium and that the complexity of figuring out how to deviate advantageously is excessive. 




of Theorem 13.36 can be found in Feige et al. (2005). The lower bound on the compet- 
itive ratio for digital goods auctions is taken from Goldberg et al. (2004). The notion of 
profit extraction, truthful mechanisms for reducing profit maximization to profit extrac- 
tion, and the RSPE auction come from Fiat et al. (2002), Deshmukh et al. (2002), and 
Goldberg and Hartline (2003). The material on cost sharing that is the basis for many 
of the known profit extractors can be found in Moulin and Shenker (2001). The idea of 
consensus estimation and truthfulness with high probability come from Goldberg and 
Hartline (2003), Goldberg and Hartline (2003). Refinements and extensions of these 
results can be found in Goldberg and Hartline (2005) and Deshmukh et al. (2002). 
The material on frugality and path auctions is drawn from Archer and Tardos (2002), 
Elkind et al. (2004), and Karlin et al. (2005). 

This survey focused primarily on auctions for digital goods. Further results on 
profit maximization (and cost minimization) in these and other settings can be found 
in Goldberg and Hartline (2001), Deshmukh et al. (2002), Fiat et al. (2002), Talwar 
(2003), Garg et al. (2002), Czumaj and Ronen (2004), Ronen and Tallisman (2005), 
Balcan et al. (2005), Borgs et al. (2005), Hartline and McGrew (2005), Immorlica et al. 
(2005), Aggarwal and Hartline (2006), and Abrams (2006). 

The research issues surveyed in the conclusions of this chapter are explored in 
a number of papers. Profit benchmarks are discussed in Goldberg et al. (2006), 
Deshmukh et al. (2002), Hartline and McGrew (2005), and Karlin et al. (2005); algorithmic
pricing problems in Guruswami et al. (2005), Hartline and Koltun (2005),
Demaine et al. (2006), Briest and Krysta (2006), Balcan and Blum (2006), and Glynn 
et al. (2006); derandomization of digital goods auctions via asymmetry in Aggarwal 
et al. (2005); fairness in Goldberg and Hartline (2003a); collusion in Schummer (1980) 
and Goldberg and Hartline (2005); bounded communication in Blumrosen and Nisan 
(2002) and Blumrosen et al. (in press); and bundling in Palfrey (1983) and Jehiel 
et al. (in press). Studies of profit maximization in online auctions can be found in 
Bar-Yossef et al. (2002), Lavi and Nisan (2000), Blum et al. (2004), Kleinberg and 
Leighton (2003), Hajiaghayi et al. (2004), and Blum and Hartline (2005). Truthfulness 
with high probability was studied in Archer et al. (2003) and Goldberg and Hartline 
(2003a, 2005). Alternative solution concepts are explored in Osborne and Rubinstein 
(1994), Lavi and Nisan (2005), and Immorlica et al. (2005), among others. 
Exercises


13.1 What is the optimal Bayesian single-item auction when the seller values the item
at $v_0 > 0$ and bidder valuations are i.i.d.?


13.2 What is the optimal Bayesian auction for a seller with k identical items and n > k 
bidders with i.i.d. valuations drawn uniformly from [0, 1]? 

13.3 Consider a discrete setting where bidder i’s probability of having valuation $v_{ij}$ is
$f_{ij}$. Derive the virtual valuations in this setting.

13.4 Show that the empirical Myerson mechanism, EM, applied to a single-item auc- 
tion problem is identically the Vickrey auction. 
13.5 The McDiarmid inequality is the following. Let
$Y_1, \ldots, Y_n$ be independent random variables taking on values from a set A and
$t : A^n \to \mathbb{R}$ a function satisfying

$$\sup_{y \in A^n,\ y'_i \in A} |t(y) - t(y_{-i}, y'_i)| \leq c_i$$

for all i. Then for all $\gamma > 0$ we have:

$$\Pr\big[\,|t(Y_1, \ldots, Y_n) - \mathbf{E}[t(Y_1, \ldots, Y_n)]| \geq \gamma\,\big] \leq 2 e^{-2\gamma^2 / \sum_i c_i^2}.$$


Prove Lemma 13.29 using the McDiarmid inequality. 


13.6 Given a set of prices Q and bids b we say $Q' \subseteq Q$ is a $\gamma$-cover of Q on b if for all
$q \in Q$ there exists $q' \in Q'$ such that

$$\sum_i |q(b_i) - q'(b_i)| \leq \gamma\, \mathrm{OPT}_Q(b).$$

(a) Prove that if $Q'$ is a $\gamma$-cover of Q and all $q' \in Q'$ are $\epsilon$-good then all $q \in Q$ are
$(\epsilon + \gamma)$-good.

(b) Show that $\mathrm{RSOP}_Q$ on input b such that $Q'$ is a $\gamma$-cover of Q is a $(1 - \epsilon - \gamma)$-approximation
with probability $(1 - \delta)$ when OPTo(b) > a In(221),

(c) For any b with $b_i \in [1, h]$, find a $\gamma$-cover of $Q = \mathbb{R}$ of size ele log log fn).


13.7 Give a deterministic asymmetric auction that is a 2-approximation to the optimal
single price sale, $\mathrm{OPT}_{\{1,h\}}(b)$, when b satisfies $b_i \in \{1, h\}$ for all i and at least two
bids have value h.


13.8 Prove that no truthful digital goods auction with 2 bidders is best. In other words, 
show that for any truthful auction A, there is another auction A’ and an input v 
such that the profit of A’ on input v is higher than that of A. 


13.9 Show how to use a $\beta$-competitive digital goods auction (against benchmark
$F^{(2)}(v)$) to obtain a $\beta$-competitive auction for the limited supply setting where only
k identical units are available for sale (use benchmark $F^{(2,k)}(v) = \max_{2 \leq i \leq k} i\, v_{(i)}$).
13.10 Prove the correctness of $\mathrm{ProfitExtract}_R$ (Definition 13.40): prove that it is truthful
and that it always obtains a profit of R when $F(b) \geq R$.
13.11 Given a monotone profit benchmark, G; a profit extractor $\mathrm{ProfitExtract}_{G,R}$ for G
that is monotone in R; and a monotone function $r(\cdot)$; consider the mechanism
that (a) computes $R = r(G(b))$, and (b) runs $\mathrm{ProfitExtract}_{G,R}(b)$.


(a) Prove that if $r(G(v_{-i})) = r(G(v))$ for particular bidder valuations v, then bidding
$b_i = v_i$ is an ex-post equilibrium, i.e., if $b_{-i} = v_{-i}$, then an optimal response
for bidder i is to bid $b_i = v_i$.

(b) Prove Theorem 13.49. 


13.12 Prove that the VCG mechanism has frugality ratio one for spanning tree auctions. 


CHAPTER 14 


Distributed Algorithmic 
Mechanism Design 


Joan Feigenbaum, Michael Schapira, and Scott Shenker 


Abstract 


Most discussions of algorithmic mechanism design (AMD) presume the existence of a trusted center 
that implements the required economic mechanisms. This chapter focuses on mechanism-design 
problems that are inherently distributed, i.e., those in which such a trusted center cannot be used. 
Such problems require that the AMD paradigm be generalized to distributed algorithmic mechanism 
design (DAMD). 

We begin this chapter by exploring the reasons that DAMD is needed and why it requires different 
notions of economic equilibrium and computational complexity than centralized AMD. We then 
consider two DAMD problems, namely distributed VCG computation and multicast cost sharing, that 
illustrate the concepts of ex-post Nash equilibrium and network complexity, respectively. 

The archetypal example of a DAMD challenge is interdomain routing, which we treat in detail. We 
show that, under certain realistic and general assumptions, one can achieve incentive compatibility 
in a collusion-proof ex-post Nash equilibrium without payments, simply by executing the Border 
Gateway Protocol (BGP), which is the standard for interdomain routing in today’s Internet. 


14.1 Introduction 


To motivate the material in this chapter, we begin with a review of why game theory is 
relevant to computer science. As noted in the Preface to this book, computer science 
has traditionally assumed the existence of a central planner who dictates the algorithms 
used by computational nodes. While most nodes are assumed to be obedient, some 
nodes may malfunction or be subverted by attackers; such byzantine nodes may act 
arbitrarily. 

This book’s founding premise, in fact its raison d’être, is that there are many
computational contexts in which there is no central (or cooperative) authority that 
controls the computational nodes. In particular, the Internet has changed computation 
from a largely local endeavor to one that frequently involves diverse collections of 
individuals (or machines acting on their behalf). For example, Web services, peer-to- 
peer systems, and even the interaction among packets on a wire are all cases in which 
individuals with no ties to each other, except perhaps a common interest in a document
or simultaneous use of a link, find themselves interacting over the Internet. 

In such cases, it is often best to treat the computational entities as independent and 
selfish agents, interested only in optimizing their own outcome. As a category of be- 
havior, selfishness lies between the extremes of automatic obedience and byzantine dis- 
ruption; selfish agents are unwilling to follow a central planner’s instructions, but they 
do not act arbitrarily. Instead, their actions are driven by incentives, i.e., the prospect 
of good or bad outcomes. The field of mechanism design, described in Chapter 9, 
has shown how, by carefully constructing economic mechanisms to provide the proper 
incentives, one can use selfish behavior to guide the system toward a socially desirable
outcome.¹ This book is devoted to exploring the interaction of incentives and
computing, a topic that has come to be known as Algorithmic Mechanism Design 
(AMD). 

Substituting a decentralized set of incentives for a central planner is a radical de- 
parture from traditional algorithm design. However, most work in this new field of 
AMD assumes the presence of a central computational facility that performs the cal- 
culations required by the economic mechanism. In auctions, for example, the agents 
each have independent goals and desires, but the computation to determine winners 
and payments is done by the auctioneer, and the hardness of the computation is eval- 
uated using traditional notions of complexity (see, e.g., Chapters 1, 9, 11, and 12). 
As such, AMD considers novel incentive-related algorithm design but uses a standard 
centralized model of algorithm execution. 

This combination of decentralized incentives but centralized computation applies in 
a wide variety of settings, many of which have been described elsewhere in this book. 
This approach requires transmitting all the relevant information to a single, trusted 
entity (hereafter called the trusted center), which is feasible if (i) such a trusted center 
exists, and (ii) the communication required to transmit the information and the resulting 
computational burden on the trusted center are both manageable. However, if either of 
these two assumptions fails, then a more decentralized approach must be considered. 

As we discuss in more detail in Section 14.3 of this chapter, the problem of inter- 
domain routing is one in which a decentralized approach is valuable. The Internet is a 
collection of smaller networks, called Autonomous Systems (ASes), that are stitched 
together by the interdomain-routing system to form the fully connected Internet. The 
interdomain-routing system therefore plays a crucial role in the functioning, even the 
existence, of the Internet. However, any approach to interdomain routing must address 
the challenges of trust, scalability, and reliability. The ASes are competing economic 
entities who want to optimize the routing outcome achieved and minimize the private 
information revealed; accordingly, they not only act selfishly but are also unwilling to 
share private information with, or cede control to, any trusted center. Thus, the ASes 
must distribute the route computation among themselves. 

Even if trust were not an issue, scalability would drive the system toward distributed 
route computation. Centralizing the route computation would involve transmitting the 
entire AS graph to a central location and updating it whenever the graph changed. 


1 This desired outcome is often defined as the optimum of some global objective function, but a wide variety of
social standards can also be used. 
Given the considerable size and volatility of the AS graph, such a centralized route
computation would be infeasible. 

Similarly, the need for reliability, so crucial in the Internet, tends to favor decen- 
tralized designs. In a centralized design, the trusted center becomes a single point of 
failure; the fate of the entire network rests on this single system that could fail or 
be subverted. As an example of how scalability and reliability can drive the need for 
decentralization, we note that current intradomain-routing algorithms, which do not 
span more than one AS and so are designed with the assumption of mutual trust among 
routers, are almost all distributed. 

Thus, there is a need to decentralize not only incentives but also computation; this 
leads to Distributed Algorithmic Mechanism Design (DAMD), which is the central 
focus of this chapter. DAMD has the same dual concerns as AMD, incentive compatibility and
computational complexity, but it differs in two important respects.

The first difference involves the nature of complexity. DAMD’s measure of com- 
putational complexity is quite different from AMD’s, because the computation is 
distributed. Any measure of the complexity of a distributed algorithm executed over 
an interconnection network T must consider at least five quantities: the total number 
of messages sent over T, the maximum number of messages sent over any one link 
in T, the maximum size of a message, the local computational burden at each node,
and the storage required at each node. If a distributed algorithm requires an excessive 
expenditure of any one of these resources, then its complexity is unacceptable. We will 
use the term network complexity to refer to these, and other, metrics of the difficulty of 
distributed implementation. 
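
As a rough illustration, the five quantities can be tallied from a log of the messages a distributed algorithm sends. The sketch below is hypothetical: the `Message` record, the `network_complexity` function, and the per-node bookkeeping dictionaries are illustrative names that do not come from the chapter, and links are treated as undirected.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    receiver: str
    size_bits: int


def network_complexity(messages, local_steps, local_storage):
    """Summarize one run by the five quantities listed above.

    local_steps / local_storage: dict node -> amount, supplied by the caller
    (hypothetical bookkeeping; the chapter does not prescribe units).
    """
    # Count messages per undirected link {sender, receiver}.
    per_link = Counter(frozenset((m.sender, m.receiver)) for m in messages)
    return {
        "total_messages": len(messages),
        "max_messages_per_link": max(per_link.values(), default=0),
        "max_message_size": max((m.size_bits for m in messages), default=0),
        "max_local_computation": max(local_steps.values(), default=0),
        "max_local_storage": max(local_storage.values(), default=0),
    }
```

An algorithm would be judged unacceptable under this view if any one of the reported values grows too quickly with the size of the network.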

If the interconnection network T is trusted by all the agents and can feasibly serve 
as the trusted center, then the measure of complexity is the main difference between 
AMD and DAMD. However, if the distributed computation is done by the agents, then 
a second difference arises: the strategic nature of the computation itself. In AMD, 
agents can manipulate a game only by their selection of actions among those described 
in the definition of the economic mechanism; they cannot affect the computation 
of the mechanism, because all outcomes are computed (by the trusted center) from 
the vector of strategies, according to the definition of the mechanism. If the agents 
themselves perform the computation using some distributed algorithm, then they have 
more opportunities to manipulate the outcome, e.g., by misrepresenting the results 
of a local computation to a neighboring agent or, more drastically, by simply not 
communicating with that neighboring agent at all, in an attempt to exclude him from the 
game. Our assumption of selfishness requires that we consider all forms of manipulative 
behavior when designing the economic mechanism; in particular, this means that we 
must provide incentives that ensure selfish agents find it in their best interest to perform 
the distributed computation correctly. 

While this chapter discusses the use of incentives to prevent these other forms of 
manipulation, one can also use cryptographic protocols to replace trusted parties in 
mechanism computation. This active area of study is covered in Chapter 8 of this 
volume. 

In the next section of this chapter, we briefly discuss two examples of DAMD. Our 
third section is devoted to an in-depth exploration of the incentive issues in interdomain 
routing. We conclude with open questions and exercises. 
14.2 Two Examples of DAMD


As noted above, DAMD differs from AMD in two respects: the additional ways in 
which the agents can influence the outcome (referred to hereafter as “computational 
manipulation”) and the measure of computational complexity (the aforementioned 
“network complexity”). 

Here, we briefly discuss two examples of DAMD that illustrate these issues. The 
first is a distributed implementation of a VCG mechanism (see Chapter 9); we will 
ignore network complexity in this example and focus on how to prevent manipulation 
of the computation. The second example is sharing the cost of a multicast transmission; 
it illustrates the notion of network complexity but, because we assume the presence of 
a trusted computational infrastructure, does not involve computational manipulation. 


14.2.1 Distributed Implementation of VCG 


We now discuss one way a set of agents can jointly implement a VCG mechanism 
without fear of manipulation. We start with a set of outcomes O and a collection 
of agents N, each with his own valuation $v_i$ over those outcomes. In our notation,
$\hat{o}$ is an outcome that maximizes the total social welfare of the agents. That is,
$\hat{o} = \arg\max_{o \in O} \sum_{i \in N} v_i(o)$, $W$ is the maximum total social welfare value, and $W_{-i}$ denotes
the maximum total social welfare of all agents except the i’th. For convenience, we
focus on the particular mechanism in which $p_i = W_{-i} - W + v_i(\hat{o})$, where $p_i$ is the
payment by agent i.
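
The payment rule can be checked on a small instance with a brute-force, centralized computation. The following Python sketch is only an illustration under stated assumptions: the function name `vcg`, the dictionary encoding of valuations, and the tie-breaking rule (first maximizer in the given order) are choices made here, not part of the chapter.

```python
def vcg(outcomes, valuations):
    """Brute-force VCG with payments p_i = W_{-i} - W + v_i(o_hat).

    outcomes: list of possible outcomes.
    valuations: dict agent -> dict outcome -> value.
    """
    def best(agents):
        # Maximum total welfare of the given agents and an outcome achieving it.
        return max(
            ((sum(valuations[a][o] for a in agents), o) for o in outcomes),
            key=lambda pair: pair[0],
        )

    agents = list(valuations)
    W, o_hat = best(agents)
    payments = {}
    for i in agents:
        W_minus_i, _ = best([a for a in agents if a != i])
        payments[i] = W_minus_i - W + valuations[i][o_hat]
    return o_hat, payments


# Single-item example: the outcome names the winner of the item.
item_values = {"a": 10, "b": 7, "c": 3}
example_valuations = {
    i: {o: (v if o == i else 0) for o in item_values}
    for i, v in item_values.items()
}
example_outcome, example_payments = vcg(list(item_values), example_valuations)
```

In this single-item instance the rule reduces to a second-price auction: agent "a" wins and pays 7, and the other agents pay nothing.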

We assume that there is no trusted center; i.e., that the computation of the VCG 
mechanism must be done by the agents themselves. However, we do presume the 
existence of some central enforcer whose responsibility it is to implement the outcome 
$\hat{o}$ decided upon by the agents and collect the payments; the enforcer can impose severe
penalties if the agents do not agree on an outcome. 

To see how a distributed computation can be manipulated, consider a network in 
which the nodes are connected in a ring, and there is exactly one agent at each node. 
Assume that the agents are computing a second-price auction of a single good by 
passing around a message containing the top two bids for that good. If an agent puts 
his bid on top and puts in a very low bid for the second bid, then he can get the good 
more cheaply (as long as these fields are not overwritten by some later agents that have 
higher bids). 
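
The ring example can be simulated directly. In the Python sketch below, the message format (current top bidder, top bid, second bid) and the `cheater` parameter are assumptions introduced for illustration; a cheating agent who takes the top slot overwrites the second-bid field with zero.

```python
def ring_second_price(bids, cheater=None):
    """Pass a (top bidder, top bid, second bid) message once around the ring.

    bids: list of (agent, bid) in ring order.
    cheater: agent who, on taking the top slot, zeroes the second-bid field.
    Returns (winner, price charged).
    """
    top_agent, top, second = None, 0, 0
    for agent, bid in bids:
        if bid > top:
            top_agent, top, second = agent, bid, top
            if agent == cheater:
                second = 0  # the manipulation described in the text
        elif bid > second:
            second = bid
    return top_agent, second


# "b" is last on the ring, so nobody can overwrite the falsified field.
honest_result = ring_second_price([("a", 5), ("c", 7), ("b", 10)])
cheating_result = ring_second_price([("a", 5), ("c", 7), ("b", 10)], cheater="b")
# Here "c" comes after "b" and overwrites the second-bid field with 7.
overwritten_result = ring_second_price([("a", 5), ("b", 10), ("c", 7)], cheater="b")
```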

More generally, consider any distributed algorithm A, capable of running over an 
arbitrary number of computational nodes, that takes as input a set of agent valuations 
and produces the maximizing outcome and the payments. As the preceding example 
suggests, if we run A over any subset of N to compute $\hat{o}$, W, and each $W_{-i}$, then there
is the possibility that an agent can manipulate the computation. 

One way to avoid this is replication: Break the agents into two groups, have them 
exchange all their valuations, and then have each group compute its own version of $\hat{o}$
and the $p_i$. If the two groups agree on the outcomes and payments, then those outcomes
and payments are adopted; if not, all agents suffer a severe penalty. Here, an agent plays 
different roles in the two versions of the computation: In the first, his role is to help 
compute the outcome and payments; in the other, his role is to provide his valuation so
that others may perform this computation. For the first version, an agent i could engage
in arbitrary computational manipulation in an attempt to obtain a more favorable $p_i$ or
choose an outcome he prefers to the socially optimal one; in the second version, all
he could do is lie about $v_i$. Because the VCG mechanism is strategyproof, the agent
will reveal truthfully to the other computational group and therefore, to avoid a severe 
penalty for inconsistency, will carry out the computation faithfully. 
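
A small Python sketch can illustrate why the comparison step deters manipulation. It uses a single-item second-price auction as the mechanism being computed; the `PENALTY` value, the function names, and the way a group's result is represented are all assumptions made for this illustration.

```python
PENALTY = -10**6  # stands in for the "severe penalty"; the magnitude is an assumption


def second_price(bids):
    """Faithful computation: (winner, price), where price is the second-highest bid."""
    ranked = sorted(bids, key=bids.get, reverse=True)
    return ranked[0], bids[ranked[1]]


def enforce(result_group1, result_group2):
    """Central enforcer: adopt the result only if both groups report the same one."""
    return result_group1 if result_group1 == result_group2 else None


def utility(agent, value, result):
    """Quasilinear utility of an agent, or the penalty if the groups disagreed."""
    if result is None:
        return PENALTY
    winner, price = result
    return value - price if agent == winner else 0


bids = {"a": 5, "b": 10, "c": 7}
faithful = second_price(bids)
# Agent "b" computes faithfully in his own group.
honest_utility = utility("b", 10, enforce(faithful, faithful))
# Agent "b" tampers with his group's result; the other group computes faithfully.
tampered_utility = utility("b", 10, enforce(("b", 0), faithful))
```

When the other group computes faithfully, tampering only produces a disagreement, so agent "b" ends up with the penalty instead of the utility of 3 he would have obtained.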

Notice that faithful computation is not a dominant strategy. If, for instance, all the 
other agents decide to choose a suboptimal outcome, then agent i is better off going 
along with that choice rather than causing a disagreement (and triggering the severe 
penalty). However, if all the other agents faithfully execute the prescribed algorithm 
A, then agent i is best off doing so as well. Thus, the most natural solution concept 
when considering computational manipulation is not dominant strategies but instead 
ex-post Nash equilibrium, which was defined in Chapter 9. We will expand on this 
point further when we discuss interdomain routing in Section 14.3 below. 

In this example, we have focused on computational manipulation and ignored net- 
work complexity. In our next example, we do the opposite. 


14.2.2 Sharing the Cost of a Multicast Transmission 


Multicast is an Internet packet-transmission mode that delivers a single packet to 
multiple receivers. It is accomplished by setting up a shared delivery tree that spans all 
the receivers; packets sent down this tree are replicated at branch points so that no more 
than one copy of each packet traverses each link. Because it is far more efficient than 
traditional unicast transmission (in which packets are sent only to a single destination), 
multicast is particularly appropriate for distributing popular real-time content, such as 
movies, to a large number of receivers. 

Internet content distribution both provides benefits and incurs cost, which we can 
model as follows. We assume that there are agents, located at various places in the 
network, who would derive some utility from receiving the content and that a cost is 
incurred each time the content is transmitted over a network link. The policy question 
is how these costs and benefits should be distributed; more specifically, which agents 
should receive the content, and how much should each agent pay? 

To define the problem more precisely, we consider a user population P residing at 
a set of network nodes N that are connected by bidirectional network links L. The 
multicast flow emanates from a source node $\alpha_s \in N$; given any set of receivers $S \subseteq P$,
the transmission flows through a multicast tree $T(S) \subseteq L$ rooted at $\alpha_s$ that spans the
nodes at which users in S reside. We make the natural assumption that routing is
monotonic, i.e., that $S_1 \subseteq S_2 \Rightarrow T(S_1) \subseteq T(S_2)$.

Each link $l \in L$ has an associated cost $c(l) \geq 0$ that is known by the nodes on
each end, and each user i assigns a utility value $u_i$ to receiving the transmission.
The total cost C(S) of reaching a set S of receivers is given by $C(S) = \sum_{l \in T(S)} c(l)$,
and the net welfare NW(S) of delivering content to this set of receivers is given by
$NW(S) = \sum_{i \in S} u_i - C(S)$.
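
These definitions can be evaluated on a small example. The Python sketch below assumes, for simplicity, that the network itself is a tree rooted at the source, so that T(S) is the union of the paths from the receivers' nodes to the root; the data layout (a `parent` map, links named by their child endpoint) and all function names are illustrative assumptions.

```python
def multicast_tree(parent, location, receivers):
    """Links of T(S): every link on a path from a receiver's node to the source.

    A link is identified by its child endpoint; parent[source] is None.
    """
    links = set()
    for i in receivers:
        node = location[i]
        while parent[node] is not None and node not in links:
            links.add(node)
            node = parent[node]
    return links


def cost(parent, link_cost, location, receivers):
    """C(S): total cost of the links in T(S)."""
    return sum(link_cost[l] for l in multicast_tree(parent, location, receivers))


def net_welfare(parent, link_cost, location, utility, receivers):
    """NW(S): sum of receiver utilities minus C(S)."""
    total_utility = sum(utility[i] for i in receivers)
    return total_utility - cost(parent, link_cost, location, receivers)


# Source "s" with child "a"; "a" has children "b" and "c".
parent = {"s": None, "a": "s", "b": "a", "c": "a"}
link_cost = {"a": 4, "b": 2, "c": 3}
location = {1: "b", 2: "c", 3: "a"}
utility = {1: 5, 2: 1, 3: 2}
```

With these numbers, serving users 1 and 3 costs 6 and yields net welfare 1, while adding user 2 raises the cost to 9 and lowers net welfare to -1.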

A cost-sharing mechanism determines which users receive the multicast transmission
and how much each receiver is charged. We let $p_i \geq 0$ denote how much user i
is charged and $\sigma_i$ denote whether user i receives the transmission; $\sigma_i = 1$ if the user
receives the multicast transmission, and $\sigma_i = 0$ otherwise.

The mechanism M is then a pair of functions $M(u) = (\sigma(u), p(u))$. It is important
to note that both the inputs and the outputs of these functions are distributed throughout
the network; that is, each user inputs his $u_i$ from his network location, and the outputs
$\sigma_i(u)$ and $p_i(u)$ must be delivered to him at that location. The practicality of deploying
the mechanism on the Internet depends on the feasibility of computing the functions
$\sigma(u)$ and $p(u)$ and distributing the results.

In our model, it is the agents who are selfish. The routers (represented by tree 
nodes), links, and other network-infrastructural components are obedient. The cost- 
sharing algorithm does not know the individual utilities, and so users could lie about 
them, but once they are reported to the network infrastructure (e.g., by sending them to 
the nearest router), the algorithms for computing $\sigma(u)$ and $p(u)$ can be reliably executed
by the network. Thus, our interest here is in network complexity, not computational 
manipulation. 

Given the selfish nature of agents, the mechanism should be strategyproof, i.e.,
revealing $u_i$ truthfully should be a dominant strategy. There are two other desirable
features one would want in a cost-sharing mechanism: budget balance (the sum of 
the charges $p_i$ covers the total cost of transmitting the content) and efficiency (the
total welfare is maximized). The classic result of Laffont and Green, as reviewed in
Chapter 9, implies that no strategyproof mechanism with quasilinear utilities can
achieve both budget balance and efficiency²; we therefore consider two separate mechanisms,
one that achieves budget balance and one that achieves efficiency.

To achieve efficiency, we consider a VCG mechanism called marginal cost (MC). Let
$\tilde{S}$ denote the largest set that maximizes NW(S) (this is uniquely defined), and let $NW = NW(\tilde{S})$;
similarly, $NW_{-i}$ is the maximum value over all S of $NW(S - i)$. Then the MC
mechanism chooses the receiver set $\tilde{S}$ and sets payments $p_i = \sigma_i u_i - NW + NW_{-i}$.

For budget balance, we choose the Shapley Value (SH) mechanism. The mechanism 
shares the cost of each link equally among all the agents downstream of that link; an 
agent i is downstream of a link l if $l \in T(\{i\})$. To determine which agents receive the
transmission, we first start with S = P and compute the charges. We then eliminate any
agent for which the charge exceeds the agent’s utility (i.e., $p_i > u_i$) and recursively
prune the receiver set until all agents within the set have utilities greater than or 
equal to their charge. The cross-monotonic nature of these charges (an agent is never 
charged less after another agent leaves the receiver set) guarantees that the resulting 
set is well defined, independent of the order in which agents are eliminated. To see 
why the ordering does not matter, consider the following. We say that an elimination 
(or pruning) is “legal” if the node to be removed is charged more than its utility; an 
elimination ordering is “legal” if each individual pruning is legal. We note that, if an 
agent i is charged more than his utility when the set S of agents remains, then this 
continues to hold when any subset of S remains (because cross-monotonicity requires 


2 More precisely, the Laffont-Green result reviewed in Chapter 9 shows that the only strategyproof, welfare-maximizing
mechanisms with quasi-linear utilities are the VCG mechanisms, which are known not to be
budget-balanced. Myerson and Satterthwaite have shown a more general result about the impossibility of 
achieving efficient and budget-balanced allocations with rational agents; see Chapter 9 for details. 
that i’s charges are at least as great). This means that the concatenation of any two legal
elimination orderings is also a legal elimination ordering (where we ignore duplicate 
prunings). For example, if (1,5, 7,3) and (7,2, 5,8) are two legal orderings, then 
(7, 2, 5, 8, 1, 3) is also legal, as is (1,5, 7, 3, 2, 8). Thus, if any two subsets S and S’ 
can be arrived at by sequences of legal eliminations, then $S \cap S'$ can also be arrived at
by a sequence of legal eliminations. 
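
The pruning procedure can be sketched in a few lines of centralized Python. This is an illustration under assumptions, not the chapter's algorithm: the network is taken to be a tree rooted at the source, links are named by their child endpoint, and every agent whose charge exceeds his utility is removed in the same round (one particular legal elimination ordering). All names are hypothetical.

```python
def shapley_shares(parent, link_cost, location, receivers):
    """Split each link's cost equally among the receivers downstream of it."""
    downstream = {}
    for i in receivers:
        node = location[i]
        while parent[node] is not None:
            downstream.setdefault(node, []).append(i)
            node = parent[node]
    share = {i: 0.0 for i in receivers}
    for link, users in downstream.items():
        for i in users:
            share[i] += link_cost[link] / len(users)
    return share


def sh_mechanism(parent, link_cost, location, utility):
    """Start with all users and prune until every charge is at most the utility."""
    receivers = set(utility)
    while True:
        share = shapley_shares(parent, link_cost, location, receivers)
        overcharged = {i for i in receivers if share[i] > utility[i]}
        if not overcharged:
            return receivers, share
        receivers -= overcharged


# Source "s" with child "a"; "a" has children "b" and "c".
parent = {"s": None, "a": "s", "b": "a", "c": "a"}
link_cost = {"a": 4, "b": 2, "c": 3}
location = {1: "b", 2: "c", 3: "a"}
utility = {1: 5, 2: 1, 3: 2}
final_receivers, final_shares = sh_mechanism(parent, link_cost, location, utility)
```

In this example user 2 is pruned in the first round, and the remaining charges (4 for user 1, 2 for user 3) exactly cover the cost 6 of the tree that serves them.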

It is easy to see that both MC and SH are polynomial-time computable by centralized 
algorithms; so the issue is whether it is hard to implement them in a distributed fashion. 
Certainly any mechanism can be computed by sending all the valuations to a single 
node, doing the computation, and then returning the results to each agent. In the worst 
case, this would require sending $\Omega(|P|)$ bits over some number of links, which is
clearly not desirable. It turns out that we cannot do substantially better than this for the 
SH mechanism. 


Theorem 14.1 Any distributed algorithm, deterministic or randomized, that 
computes the SH multicast cost-sharing mechanism must send $\Omega(|P|)$ bits over
linearly many links in the worst case. 


By contrast, it is possible to compute MC using only two short messages per link 
and two simple calculations per node. This is done in two phases, the first a bottom-up 
traversal in which welfare values are computed for each subtree of T(P) and the second 
a top-down traversal in which membership bits $\sigma_i$ and cost shares $p_i$ are computed for
each $i \in P$. The algorithms are given in Figures 14.1 and 14.2. In these figures, V(P)
denotes the node set of tree T(P), $Ch(\alpha)$ the set of children of node $\alpha$, $res(\alpha)$ the set
of users resident at node $\alpha$, $u^\alpha$ the sum of utilities of users in $res(\alpha)$, $c^\alpha$ the cost of the
link connecting $\alpha$ to its parent in T(P), and $T^\alpha(P)$ the union of the subtree rooted at
$\alpha$ and the link connecting $\alpha$ to its parent.

The reason that this simple two-phase algorithm suffices is that computing the MC
cost share $p_i$ does not require a from-scratch computation of $NW_{-i}$. Rather, it is enough
to compute $W^\alpha$ for every node $\alpha$ in V(P) during the computation of NW. Suppose that


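At node $\alpha \in V(P)$
After receiving a message $A^\beta$ from each child $\beta \in Ch(\alpha)$
{
    $W^\alpha \leftarrow u^\alpha + \left(\sum_{\beta \in Ch(\alpha)} A^\beta\right) - c^\alpha$
    If $W^\alpha \geq 0$ then
    {
        $\sigma_i \leftarrow 1$ for all $i \in res(\alpha)$
        Send $W^\alpha$ to parent($\alpha$)
    }
    Else
    {
        $\sigma_i \leftarrow 0$ for all $i \in res(\alpha)$
        Send 0 to parent($\alpha$)
    }
}

Figure 14.1. Bottom-up traversal: Computing welfare values.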
Initialize: Root $\alpha_s$ sends $W^{\alpha_s}$ to each of its children.
For each $\alpha \in V(P) - \{\alpha_s\}$
After receiving message A from parent($\alpha$)
{
    // Case 1: $T^\alpha(P) \cap T(\tilde{S}) = \emptyset$.
    // Set $\sigma_i$'s properly at $\alpha$ and propagate non-membership downward.
    If $\sigma_i = 0$, for all $i \in res(\alpha)$, or $A < 0$, then
    {
        $p_i \leftarrow 0$ and $\sigma_i \leftarrow 0$ for all $i \in res(\alpha)$
        Send $-1$ to $\beta$ for all $\beta \in Ch(\alpha)$
    }
    // Case 2: $T^\alpha(P) \cap T(\tilde{S}) \neq \emptyset$.
    // Compute cost shares and propagate minimum welfare value downward.
    Else
    {
        $A \leftarrow \min(A, W^\alpha)$
        For each $i \in res(\alpha)$
            If $u_i \leq A$, then $p_i \leftarrow 0$, else $p_i \leftarrow u_i - A$
        For each $\beta \in Ch(\alpha)$
            Send A to $\beta$
    }
}

Figure 14.2. Top-down traversal: Computing membership bits and cost shares.


i € res(B) and that y;(u) is the smallest W% of any node a on the path from 6 to the 
root of T(P). If u; < y;(u), then removing i from the set of potential receivers does not 
change the set of nodes to which the content is delivered. If u; > y;(u), then removing 
i from the set of potential receivers does change the set of nodes, and the resulting 
difference NW — NW_ ; is y;(u). The proofs of these facts are left as an exercise for 
the reader. 


Theorem 14.2. MC cost sharing requires exactly two messages per link. There is
an algorithm that computes the cost shares by performing one bottom-up traversal
of T(P), followed by one top-down traversal.³
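
As an illustration, the two traversals of Figures 14.1 and 14.2 can be simulated on a single machine. The following Python sketch is not the authors' code: the `Node` class, the function names, and the example tree are hypothetical, recursion stands in for message passing, and the root is processed like any other node (its link cost is 0, so its resident users would pay nothing). Case 1 is detected by testing W^α < 0, which agrees with the test on the σ_i bits whenever a node has resident users.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    link_cost: float = 0.0                        # c^alpha (0 at the root)
    users: dict = field(default_factory=dict)     # user id -> utility u_i
    children: list = field(default_factory=list)
    welfare: float = 0.0                          # W^alpha, set by bottom_up


def bottom_up(node, sigma):
    """Figure 14.1: compute W^alpha and return the message sent to the parent."""
    received = sum(bottom_up(child, sigma) for child in node.children)
    node.welfare = sum(node.users.values()) + received - node.link_cost
    member = node.welfare >= 0
    for i in node.users:
        sigma[i] = 1 if member else 0
    return node.welfare if member else 0


def top_down(node, a, sigma, price):
    """Figure 14.2: `a` is the parent's message (a minimum welfare value, or -1)."""
    if a < 0 or node.welfare < 0:
        # Case 1: the subtree receives nothing; propagate non-membership.
        for i in node.users:
            sigma[i], price[i] = 0, 0
        for child in node.children:
            top_down(child, -1, sigma, price)
    else:
        # Case 2: track the smallest welfare value seen on the path to the root.
        a = min(a, node.welfare)
        for i, u in node.users.items():
            price[i] = 0 if u <= a else u - a
        for child in node.children:
            top_down(child, a, sigma, price)


def mc_cost_shares(root):
    """Run both traversals; return (membership bits, cost shares)."""
    sigma, price = {}, {}
    bottom_up(root, sigma)
    top_down(root, root.welfare, sigma, price)
    return sigma, price


def example_tree():
    """Hypothetical instance: root -> a (cost 5) -> c (cost 1), root -> b (cost 3)."""
    c = Node("c", 1, {"w": 6})
    a = Node("a", 5, {"x": 4, "y": 3}, [c])
    b = Node("b", 3, {"z": 1})
    return Node("root", 0, {}, [a, b])
```

On this example tree, the smallest welfare value on the path from user w to the root is 5, so w pays 6 − 5 = 1; users x and y receive the content at no charge, and z is excluded because its subtree has negative welfare.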


More information about AMD for cost sharing can be found in Chapter 15. 


³ The algorithm is provably optimal with respect to the number of messages sent but is not known to be optimal
with respect to the maximum size of a message. However, the maximum size of a message is polynomial in
max_l size(c(l)) and max_i size(u_i) and polylogarithmic in |P| and ||, and the two local computations required
at each node are fast and space-efficient.


14.3 Interdomain Routing


We now turn to the problem of interdomain routing. To provide reachability between
hosts, the various ASes that make up the Internet must be interconnected. However, as
we noted earlier, the ASes are economically independent entities (indeed, frequently
competitors), and there is no trusted center to which they are all accountable that could
assign interdomain routes. Thus, the ASes themselves must compute the routes in
a distributed fashion. The route computation scheme must handle three problematic
aspects of interdomain routing: (i) there is a large number of ASes; (ii) different ASes
have different criteria for choosing one route over another, and these criteria may
conflict; and (iii) the collection of ASes and the links between them change frequently.
All of these factors make DAMD a highly suitable approach to interdomain routing.

We can formally define the interdomain-routing problem as follows. The network
topology is defined in terms of the AS graph G = (N, L), where each node in N =
{1, ..., n} corresponds to an AS in the Internet, and each link in L corresponds to
a direct connection between a pair of neighboring ASes. Because routing protocols
typically compute routes for each destination independently, we can choose a particular
destination AS d and let P^i be the set of all loop-free paths from i to d in G that are
not removed from consideration.⁴ An interdomain-routing protocol allocates to each
source node i ∈ N a route R_i ∈ P^i.
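
To make the set of candidate routes concrete, the following sketch enumerates all loop-free paths from a source to d in a small AS graph. It is only an illustration: the function name and the string node labels are hypothetical, links are treated as undirected, and no paths are "removed from consideration", so the result is the unfiltered version of P^i.

```python
def loop_free_paths(links, source, dest):
    """Enumerate all simple (loop-free) paths from source to dest."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    paths = []

    def extend(path):
        last = path[-1]
        if last == dest:
            paths.append(tuple(path))
            return
        for nxt in sorted(neighbors.get(last, ())):
            if nxt not in path:          # never revisit a node: no loops
                extend(path + [nxt])

    extend([source])
    return paths


# A triangle with two source ASes "1" and "2" and destination "d".
TRIANGLE = [("1", "d"), ("2", "d"), ("1", "2")]
```

For the triangle above, node 1 has exactly two candidate routes: the direct route 1d and the indirect route 12d.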

We now describe this problem in greater detail, first from the networking perspective 
and then from the mechanism-design perspective. 


14.3.1 Networking Perspective 


From a networking or protocol-design point of view, any wide-area routing protocol 
must fulfill, to some extent, the following requirements: 


• For reasons of trust, scale, and robustness, the routing protocol must be distributed,
carried out by the ASes themselves.

• In order to reduce routing state, the routing protocol must use destination-based
forwarding; i.e., all routing decisions must be based solely on a packet’s destination.
Each AS has a single next hop for the destination d, and the resulting route allocation
T_d = {R_1, ..., R_n} forms a confluent tree to the destination d.

• The routing protocol should be adaptive, adjusting to the current network topology
without relying on any a priori topology information.

• The routing protocol should be time-efficient, communication-efficient (in its use of
communication between the ASes), and space-efficient (in its use of the storage space
that each individual AS needs in order to participate in the protocol).


These requirements are satisfied by each of the common routing-protocol designs
(namely distance-vector, link-state, and path-vector), although these designs differ in
their space requirements. However, interdomain routing has one additional requirement:


• The routing protocol must produce loop-free routes even while individual ASes make
autonomous decisions about which routes are preferable.


Of the common routing-protocol designs, only path-vector satisfies this requirement.
As a result, the current standard protocol for Internet interdomain routing, the Border
Gateway Protocol (BGP), is a path-vector protocol. To see why path-vector is a suitable
design choice, we describe BGP in more detail.


⁴ A path from i to d could be “removed from consideration” because it is filtered by i or one of i’s neighbors or
because of link or node failures.


Figure 14.3. Route computation using a path-vector protocol.

BGP allows adjacent nodes to exchange information through update messages that
announce newly chosen routes (see illustration in Figure 14.3); a route announcement
contains the entire path to the destination (the list of ASes in the path). A path-vector
protocol (like most other routing protocols) computes routes to every destination
AS independently; so we can focus on routes to a single destination d. The
route-computation process is initialized when d announces itself to its neighbors by sending
update messages. The rest of the routing tree to d is built recursively, as knowledge of
how to reach d propagates through the network via subsequent update messages. We
assume that the network is asynchronous, meaning that the arrival of update messages
along some links can be delayed.

The routing process at a particular node i has three stages that are iteratively applied:


(i) Importing routes: Routes to d are received via update messages from i’s neighbors.
Node i has an import policy that specifies which of the routes it is willing to consider.
All such importable routes are stored in an internal routing table. At any given time,
i’s internal routing table contains the latest importable routes.

(ii) Route selection: If there is more than one route to d in the routing table (i.e., more than
one of i’s neighbors has announced an importable route to d), node i must choose one
(expressing a local preference over routes).

(iii) Exporting routes: Whenever there is a change to i’s best route, it announces the newly
selected route to some or all of its neighbors using update messages. Node i has
an export policy that determines, for each neighbor j, which routes it is willing to
announce to j at any given time.


Figure 14.4. When AS 1 prefers route 12d to 1d, and AS 2 prefers route 21d to 2d, BGP (or
any other path-vector protocol) can oscillate indefinitely.
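
The three stages can be sketched as a small class that holds one node's state for a single destination d. This is a simplified illustration rather than BGP itself: the class and method names, the representation of routes as tuples of node labels, and the use of `None` as a withdrawal are all assumptions made for the example.

```python
class PathVectorNode:
    """One node's routing state for a single destination (illustrative only)."""

    def __init__(self, name, rank, may_import=None, may_export=None):
        self.name = name
        self.rank = rank                  # route -> preference (higher is better)
        self.may_import = may_import or (lambda route: True)
        self.may_export = may_export or (lambda neighbor, route: True)
        self.table = {}                   # neighbor -> latest importable route
        self.best = None

    def import_route(self, neighbor, route):
        """Stage (i): record the neighbor's latest announcement if policy allows."""
        if route is None or self.name in route:
            self.table.pop(neighbor, None)        # withdrawn or looping route
            return
        candidate = (self.name,) + tuple(route)
        if self.may_import(candidate):
            self.table[neighbor] = candidate
        else:
            self.table.pop(neighbor, None)

    def select(self):
        """Stage (ii): choose the most preferred stored route; True if it changed."""
        routes = list(self.table.values())
        new_best = max(routes, key=lambda r: self.rank.get(r, 0)) if routes else None
        changed = new_best != self.best
        self.best = new_best
        return changed

    def export(self, neighbors):
        """Stage (iii): announcement per neighbor; None withdraws the old route."""
        return {j: self.best if self.best and self.may_export(j, self.best) else None
                for j in neighbors}
```

Because every announcement carries the full path, `import_route` can discard any route that already contains the node itself, which is how loops are avoided regardless of the ranking the node uses.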


AS autonomy is expressed through the freedom each AS has in choosing its routes, 
its import policy, and its export policy. These choices are based on local policy
considerations and need not be coordinated with any other AS. The inclusion of the entire
path in route announcements allows ASes to avoid routes with loops even while making 
otherwise arbitrary policy choices. Link-state or distance-vector routing protocols can 
avoid loops only if all ASes use the same criterion to choose routes and thus do not 
support autonomy. 

One design requirement not explicitly listed here is convergence. Clearly the routing
protocol should eventually enter a stable state in which every node prefers its currently
chosen route to all others in its routing table, and every routing table reflects the current
route choices of that node’s neighbors. Moreover, we would like the protocol to be robust,
converging for every AS graph obtained by removing any set of nodes and links from
the original instance.

Unfortunately, while the path-vector form of routing prevents loops, it does not 
ensure convergence; the routing announcements can enter a persistent oscillatory state. 
Consider the simple example depicted in Figure 14.4. Both nodes 1 and 2 would rather 
send traffic through the other source node than send traffic directly to the destination. 
Let us now simulate the execution of a path-vector protocol in the worst-case scenario: 
The computation is initialized when d announces itself to its two neighbors, nodes 
1 and 2. At this point in time, these direct paths are the only routes available to d. 
Hence, 1 and 2 will choose the routes 1d and 2d, respectively, and inform each other,
via update messages, of their selected routes. Upon receipt of these update messages, 
nodes 1 and 2 will change their selected routes to, respectively, 12d and 21d. However, 
now that none of the direct routes is being used, the indirect routes are no longer viable; 
so 1 and 2 are forced to return to their former routes 1d and 2d, and the oscillation 
continues indefinitely. Note that, if the network had started with node 1’s choosing and 
announcing 1d (having not yet seen an announcement of route 2d), and then node 2 
had chosen 21d (having seen route 1d announced before it chose and announced its 
own direct route 2d), then no further changes would occur, and the network would be 
in a stable configuration; thus, convergence and oscillations can depend on timing. 
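
Both executions described above can be reproduced with a few lines of simulation. The sketch below is an abstraction, not a protocol implementation: the names `PREFS`, `best_route`, and `run` are hypothetical, update messages are modeled by letting an activated node read its neighbor's current choice, and nodes listed in the same group update simultaneously.

```python
# Permitted routes of each node in Figure 14.4, most preferred first.
PREFS = {
    "1": [("1", "2", "d"), ("1", "d")],
    "2": [("2", "1", "d"), ("2", "d")],
}


def best_route(node, chosen):
    """Most preferred route whose tail matches the next hop's current choice."""
    for route in PREFS[node]:
        tail = route[1:]
        if tail == ("d",) or chosen.get(tail[0]) == tail:
            return route
    return None


def run(schedule):
    """Activate groups of nodes in order; return the choices after each step."""
    chosen, history = {}, []
    for group in schedule:
        updates = {n: best_route(n, chosen) for n in group}   # simultaneous update
        chosen.update(updates)
        history.append((chosen.get("1"), chosen.get("2")))
    return history


synchronous = run([("1", "2")] * 4)                 # both nodes update together
staggered = run([("1",), ("2",), ("1",), ("2",)])   # node 1 moves first
```

In the synchronous run the pair of choices alternates between (1d, 2d) and (12d, 21d); in the staggered run node 1 settles on 1d and node 2 on 21d, and nothing changes afterward.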

A large body of networking research has addressed the problem of providing sufficient
conditions on routing policies for the convergence of path-vector protocols. There
is an inherent trade-off between the desired autonomy at the local level and robustness
(in the sense defined above) at the global level. However, there is a known sufficient
condition on policies, called no dispute wheel, that guarantees robust convergence
while allowing fairly expressive local routing policies. Any network instance on which
a path-vector protocol might oscillate contains a dispute wheel and, more importantly,
the absence of a dispute wheel means that the instance and every subinstance of it have
unique stable route allocations to which the routing protocol converges, i.e., no dispute
wheel implies robustness. The following definition provides an equivalent sufficient
condition:


Definition 14.3. Define two relations on permitted routes:
(i) Let R_1 ⊙_1 R_2 iff R_1 is a subpath of R_2 that ends at d.
(ii) Let R_1 ⊙_2 R_2 iff ∃ i ∈ N : R_1, R_2 ∈ P^i, and i prefers R_1 over R_2.


Let ⊙ = (⊙_1 ∪ ⊙_2)* be the transitive closure of ⊙_1 ∪ ⊙_2. Note that ⊙ is inherently
reflexive and transitive.


An interdomain-routing instance has no dispute wheel iff R_1 ⊙ R_2 and R_2 ⊙ R_1
together imply that R_1, R_2 start at the same node. (Informally, this is antisymmetry of
⊙ except that ties are allowed in valuations.)

Let us revisit the example in Figure 14.4. Recall that, on this instance, path-vector
protocols may oscillate forever. This anomaly is manifested by the following dispute
wheel:


1d ⊙_1 21d ⊙_2 2d ⊙_1 12d ⊙_2 1d.
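
For small instances, the condition of Definition 14.3 can be checked by brute force. The sketch below is an illustration under stated assumptions: routes are tuples of node labels ending at "d", preferences are given as numeric values (with "prefers" read as weak preference, so that ties are allowed), and the function and variable names are hypothetical.

```python
from itertools import product


def has_dispute_wheel(valuations):
    """valuations: node -> {route: value}. True if Definition 14.3 is violated."""
    routes = [r for vals in valuations.values() for r in vals]

    def related(r1, r2):
        # Relation 1: r1 is a subpath of r2 that ends at d (a suffix of r2).
        is_suffix = len(r1) <= len(r2) and r2[-len(r1):] == r1
        # Relation 2: both routes start at the same node, which weakly prefers r1.
        prefers = r1[0] == r2[0] and valuations[r1[0]][r1] >= valuations[r1[0]][r2]
        return is_suffix or prefers

    reach = {(a, b): related(a, b) for a, b in product(routes, repeat=2)}
    # Warshall's algorithm; the intermediate route k varies slowest.
    for k, a, b in product(routes, repeat=3):
        if reach[a, k] and reach[k, b]:
            reach[a, b] = True
    return any(reach[a, b] and reach[b, a] and a[0] != b[0]
               for a, b in product(routes, repeat=2))


# Figure 14.4: each node prefers the route through the other node.
FIGURE_14_4 = {
    "1": {("1", "2", "d"): 2, ("1", "d"): 1},
    "2": {("2", "1", "d"): 2, ("2", "d"): 1},
}
# Same topology, but both nodes prefer their direct routes.
DIRECT_FIRST = {
    "1": {("1", "2", "d"): 1, ("1", "d"): 2},
    "2": {("2", "1", "d"): 1, ("2", "d"): 2},
}
```

The check reports a dispute wheel for the Figure 14.4 preferences and none when both nodes prefer their direct routes. The cubic closure step makes this practical only for toy instances.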


So far, our discussion of interdomain routing has focused on traditional networking 
concerns. We now consider the problem from a mechanism-design perspective. 


14.3.2 Mechanism-Design Perspective 


The policy autonomy in BGP, which was previously allowed to be an arbitrary choice,
can be seen as expressing a preference that an AS is selfishly trying to satisfy. To model
this, we let each source node i have a private valuation function v_i : S^i → ℝ_{≥0}, where S^i
is the set of all simple (noncyclic) routes from i to d in the complete graph we get by
adding links to G.⁵ The valuation function v_i specifies the “monetary value” of each
route to source node i. We assume that v_i(∅) = 0 and that, for all pairs of routes R_1
and R_2 through different neighboring nodes, v_i(R_1) ≠ v_i(R_2).⁶ The routing policy of
each node i is thus captured by v_i.

⁵ Because we do not assume that nodes know the network topology, we cannot assume that they can distinguish
valid routes from invalid ones. Thus, the valuation functions are defined over the complete graph to model the
possibility of nodes’ announcing nonexistent routes.

⁶ This assumption is consistent with current interdomain routing: Because at most one route to each destination
can be installed in a router’s forwarding table, nodes have some way to break ties, e.g., based on the next hop’s
IP address; so, valuations can be adjusted accordingly to match this. However, because only one route per
neighbor is considered at a time, ties in valuation are permitted for routes through the same neighboring node.


While each individual AS is trying to optimize its individual welfare, society as a
whole has an interest in reaching a globally desirable outcome. While there are many
goals one could choose, we shall focus here on social-welfare maximization. A route
allocation T_d maximizes the social welfare if

\[ T_d \in \arg\max_{T = \{R_1, \ldots, R_n\}} \sum_{i=1}^{n} v_i(R_i). \]

If we view a routing protocol from a mechanism-design perspective, it should satisfy
the following two requirements:


• If implemented honestly, the protocol should maximize the social welfare.
• The protocol should be incentive-compatible, in that no AS is motivated to deviate from
the actions it is asked to perform.


The precise definition of incentive compatibility needed in this setting depends on 
the nature of the solution concept (or economic equilibrium). We shall now discuss in 
detail the solution concept that we adopt for interdomain-routing mechanisms. Recall 
from Section 14.1 that DAMD poses inherently different strategic challenges from 
AMD, because, in the absence of a trusted center, the computation is performed by 
the strategic agents themselves. This allows the computational nodes to manipulate the 
mechanism strategically in ways other than “lying” about their private types. They can, 
for instance, alter the computation to their own benefit or refuse to pass messages if 
it suits their needs. In such a scenario, aiming for strategyproofness might be futile, 
because it is unlikely that there is a single computational behavior that is optimal no 
matter what the other agents do. 

A more suitable solution concept is ex-post Nash equilibrium. The need to settle for
ex-post Nash, rather than strategyproofness, can be viewed as the cost of distributing
mechanism computation among the agents. We shall now formally define ex-post Nash
in a distributed setting: Consider a computational network with n nodes and a set of
possible outcomes O. Each node i has a private type θ_i ∈ Θ_i and a utility function
u_i : O × Θ_i → ℝ.


Definition 14.4 A distributed mechanism d^M is a 3-tuple d^M = (Σ, g, s^M),
where Σ = (Σ_1, ..., Σ_n) is the feasible strategy space of the nodes, g : Σ → O is
the outcome function computed by the mechanism, and s^M = (s^M_1, ..., s^M_n) ∈ Σ
is the prescribed strategy.


For every node i, s^M_i ∈ Σ_i can be thought of as the algorithm that the mechanism
designer intends i to execute. s^M_i is parameterized by the private type θ_i of the node
i, with s^M_i(θ_i) specifying which actions node i should perform in every state of the
mechanism and network, given that its type is θ_i.


Definition 14.5 A strategy profile s* ∈ Σ is an ex-post Nash equilibrium of a
distributed mechanism d^M = (Σ, g, s^M), if

\[ u_i\bigl(g(s^*_1(\theta_1), \ldots, s^*_n(\theta_n)), \theta_i\bigr) \ge u_i\bigl(g(s^*_1(\theta_1), \ldots, s_i(\theta_i), \ldots, s^*_n(\theta_n)), \theta_i\bigr) \]

for every node i, for every possible strategy s_i ∈ Σ_i, for every possible θ_i, and for
all possible private types θ_1, ..., θ_{i−1}, θ_{i+1}, ..., θ_n of the other nodes.

Although weaker than a dominant-strategy equilibrium, ex-post Nash equilibrium
is a fairly strong solution concept; it does not require strategic agents to have any
knowledge of or to make any assumptions about the private types of other agents.
Contrast this with the standard Nash-equilibrium concept, in which agents are assumed
to know the private types of other agents; in the interdomain-routing context, this would
mean that ASes are assumed to know the local routing policies of other ASes, which
is certainly unrealistic.
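
The quantifiers in the definition of ex-post Nash equilibrium can be made concrete by exhaustive enumeration on a toy mechanism. The sketch below is not a distributed mechanism and is not taken from the chapter: it assumes two nodes, three possible types, a second-price outcome rule with ties going to node 0, and strategies that map a type directly to a bid. All names are hypothetical.

```python
from itertools import product

TYPES = (1, 2, 3)      # possible private types theta_i (values for the item)
BIDS = (1, 2, 3)       # possible actions


def outcome(bids):
    """Outcome function g: second-price rule for two nodes, ties go to node 0."""
    winner = 0 if bids[0] >= bids[1] else 1
    return winner, bids[1 - winner]            # (winner, price paid)


def utility(i, result, theta_i):
    winner, price = result
    return theta_i - price if winner == i else 0


def is_ex_post_nash(profile):
    """profile[i] maps each type to an action; check every type vector and deviation."""
    strategies = [dict(zip(TYPES, acts)) for acts in product(BIDS, repeat=len(TYPES))]
    for thetas in product(TYPES, repeat=2):
        prescribed = [profile[j][thetas[j]] for j in range(2)]
        for i in range(2):
            base = utility(i, outcome(prescribed), thetas[i])
            for deviation in strategies:
                bids = list(prescribed)
                bids[i] = deviation[thetas[i]]
                if utility(i, outcome(bids), thetas[i]) > base:
                    return False               # a profitable unilateral deviation
    return True


truthful = {t: t for t in TYPES}
lowball = {t: 1 for t in TYPES}
```

Truthful bidding by both nodes passes the check for every type vector, whereas the profile in which both nodes always bid 1 fails: when node 1 has type 3 it gains by outbidding node 0. Note that the inequality is tested for each realization of the other node's type, without any assumption about how types are distributed.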

The ex-post Nash equilibrium solution concept is susceptible to collusion.⁷ That is,
while it is true that unilateral deviation by an AS from the prescribed strategy profile 
cannot benefit it, coordinated deviation by several ASes might prove to be beneficial 
to some. Therefore, if at all possible, we would like our mechanisms to ensure that no 
deviation by a group of ASes from the prescribed strategy profile is worthwhile. To 
achieve this, we introduce collusion-proof ex-post Nash equilibria. In a collusion-proof 
ex-post Nash equilibrium, no deviation by a group of agents can strictly improve the 
outcome of even a single agent in that group without strictly harming another. 


14.3.3 A DAMD Approach: Combining the Two Perspectives 


To achieve incentive-compatible interdomain routing, we must design a protocol that 
makes sense from both the networking and the mechanism-design perspectives. The 
networking requirements point to a path-vector framework combined with a class of 
routing preferences that guarantees convergence. Mechanism design requires that we 
incent agents to implement this routing protocol faithfully. Incentive compatibility is 
often achieved through payments; however, below we show that, under a reasonable set 
of assumptions about routing policies, one can achieve collusion-proof ex-post Nash 
equilibrium without payments simply by executing BGP. 


14.3.3.1 Commercial Internet Routing and the Gao-Rexford Model 


There are two types of business relationships that characterize most AS
interconnections: customer-provider and peering. Customer ASes pay their provider ASes for
connectivity, and peers are AS pairs that find it mutually advantageous to exchange
traffic for free. One advantage of peering is that the two peers need not pay their respective
providers to exchange traffic directly. An AS can be in many different relationships
simultaneously: It can be a customer of one or more ASes, a provider to others, and a peer
to yet others. These agreements are assumed to be relatively long-term contracts that
are formed because of various external factors, e.g., traffic patterns and network sizes.

These business relationships naturally induce the following constraints on routing
policies, known as the Gao-Rexford constraints:


No customer-provider cycles: Let G_CP be the digraph with the same set of nodes as
G and with a directed edge from every customer to its provider. The Gao-Rexford
constraints require that there be no directed cycles in this graph. This requirement is
a natural economic assumption, because a cycle in G_CP implies that at least one AS
is (indirectly) its own provider.

Prefer customers over peers and peers over providers: A customer route is a route
in which the next-hop AS is a customer. Provider and peer routes are defined
similarly. Because, typically, customers pay providers for service, and peers exchange
service for free, the Gao-Rexford constraints require that nodes always prefer (i.e.,
assign a higher value to) customer routes over peer routes, which are in turn preferred
over provider routes.

Provide transit services only to customers: Transit service is carrying packets that
originate and terminate at hosts outside the node. ASes are paid to carry customer
packets but are not paid to carry peer or provider traffic. The Gao-Rexford
constraints require that ASes not carry transit traffic between their providers and peers.
Therefore, ASes should announce only customer routes to their providers and peers
but should announce all of their routes to their customers.


⁷ The Nash equilibrium and dominant-strategy equilibrium concepts are also susceptible to collusion.


v_4(432d) = 1 + a    v_4(431d) = 0
v_3(31d) = 1         v_3(32d) = [illegible]
v_1(1d) = 1          v_1(132d) = 0
v_2(2d) = 1          v_2(231d) = 0

Figure 14.5. A routing instance that satisfies the Gao-Rexford constraints on which every
path-vector protocol converges to a route allocation that is arbitrarily far from optimal.


These constraints ensure robustness without requiring coordination between ASes.
In fact, if all ASes obey the Gao-Rexford constraints, then their valuations cannot
induce a dispute wheel.
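
Two of the three constraints are easy to state as code. The sketch below is illustrative only: the function names, the string labels for relationships, and the dictionary encoding of the customer-provider digraph are assumptions, and the sample relationships are those of the instance in Figure 14.5.

```python
def has_customer_provider_cycle(provider_of):
    """provider_of: AS -> set of its providers. Depth-first search for a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GRAY                       # node is on the current DFS path
        for p in provider_of.get(node, ()):
            state = color.get(p, WHITE)
            if state == GRAY or (state == WHITE and visit(p)):
                return True                      # reached an AS already on the path
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in provider_of)


def may_export(next_hop_relation, neighbor_relation):
    """Transit rule, from the exporting AS's point of view.

    Customer routes may be announced to everyone; peer and provider routes
    may be announced only to customers.
    """
    return next_hop_relation == "customer" or neighbor_relation == "customer"


# Customer -> providers, as described for Figure 14.5.
FIGURE_14_5 = {"d": {"1", "2"}, "1": {"3"}, "2": {"3"}, "3": {"4"}}
```

A function such as `may_export` could serve as the export policy of a path-vector node; the preference constraint would be enforced separately, by ranking routes first on the relationship with the next hop.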

The Gao-Rexford constraints ensure robust convergence, but in general they do not
guarantee that BGP converges to the social-welfare-maximizing route allocation. To see
this, consider the example in Figure 14.5. Assume that d is a customer of 1 and 2, that 1
and 2 are customers of 3, that 3 is a customer of 4, and that a > 0. Observe that this AS
graph satisfies all the Gao-Rexford constraints. The unique stable route allocation (to
1, ..., 4, respectively) is {1d, 2d, 31d, 431d}. However, the optimal route allocation is
{1d, 2d, 32d, 432d}. This allocation will never be chosen by local decisions, because
node 3 would much prefer routing through node 1, a route that is always available for
it to choose. Therefore, because the value of a can be arbitrarily high, this implies that
the route allocation computed by a path-vector protocol could be arbitrarily far from
the welfare-maximizing route allocation.
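
The gap between the two allocations can be computed directly from the valuations in Figure 14.5. One caveat: the value v_3(32d) is not legible in the source, so the sketch below takes it as a parameter and assumes 0 by default; any value below v_3(31d) = 1 is consistent with the statement that node 3 prefers routing through node 1. The function name is hypothetical.

```python
def welfare_of_allocations(a, v3_via_2=0):
    """Return (welfare of the stable allocation, welfare of the optimal one).

    v3_via_2 stands for v_3(32d); its default of 0 is an assumption.
    """
    values = {
        "1d": 1, "132d": 0,
        "2d": 1, "231d": 0,
        "31d": 1, "32d": v3_via_2,
        "431d": 0, "432d": 1 + a,
    }
    stable = ["1d", "2d", "31d", "431d"]
    optimal = ["1d", "2d", "32d", "432d"]
    return (sum(values[r] for r in stable), sum(values[r] for r in optimal))
```

Under this assumption the stable allocation always has welfare 3, while the optimal allocation has welfare 3 + a, so the difference grows without bound as a increases.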

This problem can be overcome by imposing the policy-consistency property. 


Definition 14.6 Policy consistency holds iff, for every two adjacent nodes i, j ∈
N, and every two routes {Q, R} ⊆ P^j such that {(i, j)Q, (i, j)R} ⊆ P^i ⁸ (in
particular, node i is not on Q or R),

\[ \text{if } v_j(Q) \ge v_j(R), \text{ then } v_i((i, j)Q) \ge v_i((i, j)R). \]


⁸ (i, j)Q and (i, j)R are the routes from i to d that have (i, j) as a first link and then follow Q and R, respectively.


Informally, policy consistency holds if, for every two neighboring nodes i, j, such
that j is i’s next-hop node on two routes, we have that, if j weakly prefers one route
over another, then so must i. The policy-consistency property holds in the two most
well studied special cases of interdomain routing. The first is the case in which the
valuation of a route is solely a function of the route’s next hop. (These are called
“next-hop policies.”) The second is the case in which there is some metric function that
assigns a “length” to every link, and every valuation function prefers “shorter” routes
(i.e., those with smaller total lengths in this metric). (These are called “metric-based
policies.”)
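
Policy consistency can be tested mechanically on a small instance. In the sketch below, routes are tuples of node labels ending at "d" and valuations are dictionaries; these encodings and all names are assumptions made for the example. The first instance uses the valuations of Figure 14.5 with a = 5 and with v_3(32d) assumed to be 0, because that value is not legible in the source.

```python
from itertools import permutations


def is_policy_consistent(valuations):
    """valuations: node -> {route: value}. Check Definition 14.6 on all route pairs."""
    for i, vals_i in valuations.items():
        for route_a, route_b in permutations(vals_i, 2):
            j = route_a[1]                        # next hop of route_a
            q, r = route_a[1:], route_b[1:]       # the neighbor's routes Q and R
            vals_j = valuations.get(j, {})
            if route_b[1] != j or q not in vals_j or r not in vals_j:
                continue                          # the definition does not apply
            if vals_j[q] >= vals_j[r] and not vals_i[route_a] >= vals_i[route_b]:
                return False
    return True


A = 5
FIGURE_14_5_VALUES = {
    "1": {("1", "d"): 1, ("1", "3", "2", "d"): 0},
    "2": {("2", "d"): 1, ("2", "3", "1", "d"): 0},
    "3": {("3", "1", "d"): 1, ("3", "2", "d"): 0},      # v_3(32d) = 0 is assumed
    "4": {("4", "3", "2", "d"): 1 + A, ("4", "3", "1", "d"): 0},
}
# Same instance, except that node 4 now agrees with node 3's ranking.
REPAIRED = dict(FIGURE_14_5_VALUES)
REPAIRED["4"] = {("4", "3", "2", "d"): 1 + A, ("4", "3", "1", "d"): 2 + A}
```

Under the stated assumption, the Figure 14.5 instance fails the check because node 3 prefers 31d to 32d while node 4 ranks the corresponding extensions the other way; raising v_4(431d) above v_4(432d) restores consistency.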

We are now ready to state, and prove, the following theorem. 


Theorem 14.7 If the Gao-Rexford constraints and policy consistency hold, then
BGP converges to the social-welfare-maximizing route allocation and is
incentive-compatible in collusion-proof ex-post Nash equilibrium (without any monetary
transfer).


PROOF We will actually prove a result that is stronger in two senses: First, we shall
prove our result in the more general setting in which the valuation functions do not
induce a dispute wheel, and policy consistency holds. Second, we shall prove that BGP
actually converges to a solution (an allocation of routes) in which every AS gets its
most desired route to the destination. That is, every AS will be assigned a route that
maximizes its valuation function. We call this kind of route allocation a locally optimal
solution. Observe that any locally optimal solution is also globally optimal in that it
maximizes the total social welfare. Moreover, locally optimal solutions are
deviation-proof in that there is no deviation by a group of agents that can strictly improve the
outcome of even a single agent. This is far stronger than collusion-proof ex-post Nash
equilibrium, which only requires that no deviation by a group of agents can strictly
improve the outcome of a single agent in the group without strictly harming another
agent in the group.

Because the Gao-Rexford constraints imply that there is no dispute wheel, we are
assured (by the result mentioned in Section 14.3.1) that BGP will converge to a unique
stable solution. We denote this solution by T_d = {S_1, ..., S_n}, where S_i is the route
allocated to node i.


Lemma 14.8 If the valuation functions do not induce a dispute wheel, and
policy consistency holds, then BGP converges to a unique stable, locally optimal
route allocation T_d.


PROOF Consider a node m ∈ N. Let R = u_k u_{k−1} ... u_i ... u_0 be some loop-free
route in P^m, such that u_k = m and u_0 = d. By induction, we show for each u_i ∈ R
that S_i, the solution’s route for node u_i in T_d, is at least as good as R_i = u_i ... u_0.
When u_i = m (i.e., i = k), this says that S_m is at least as good as R; because R and m
were chosen arbitrarily, this establishes the local optimality of T_d.


Base case. i = 0. The induction hypothesis is trivially true, because the only route
is the empty one.


Induction step. Assume that the induction hypothesis is true for u_{i−1}, i.e.,

\[ v_{u_{i-1}}(S_{i-1}) \ge v_{u_{i-1}}(R_{i-1}). \tag{14.1} \]

Note that u_i does not lie on R_{i−1}, because R is loop-free.


Case I. Assume that u_i ∉ S_{i−1}. Then extend S_{i−1} and R_{i−1} along the edge
(u_i, u_{i−1}). (u_i, u_{i−1})S_{i−1} ∈ P^{u_i}; thus, from (14.1) and policy consistency, we
have

\[ v_{u_i}((u_i, u_{i-1})S_{i-1}) \ge v_{u_i}(R_i). \tag{14.2} \]

T_d is stable; so, S_i is at least as good as any other route at u_i; in particular,

\[ v_{u_i}(S_i) \ge v_{u_i}((u_i, u_{i-1})S_{i-1}). \tag{14.3} \]

Combining (14.2) and (14.3) gives

\[ v_{u_i}(S_i) \ge v_{u_i}(R_i), \]

which is the induction statement for u_i.


Case II. Assume that u_i ∈ S_{i−1}. We cannot use the policy-consistency argument
as in Case I, because extending S_{i−1} to u_i creates a loop. This implies that
u_{i−1} ∉ S_i. Suppose that the induction statement is not true for i, i.e., that v_{u_i}(R_i) >
v_{u_i}(S_i). Then R_i ⊙_2 S_i. Because u_{i−1} ∉ S_i but u_i ∈ S_{i−1}, it must be that S_i ⊙_1 S_{i−1}.
From the induction hypothesis, S_{i−1} ⊙_2 R_{i−1}, and, because R_i = (u_i, u_{i−1})R_{i−1},
R_{i−1} ⊙_1 R_i. Therefore, we have a cycle in the relation ⊙; in particular, we can
say that R_i ⊙ R_{i−1} and R_{i−1} ⊙ R_i, but these routes do not start at the same node.
This violates the no-dispute-wheel property and shows that the assumption that
v_{u_i}(R_i) > v_{u_i}(S_i) leads to a contradiction. Therefore, v_{u_i}(R_i) ≤ v_{u_i}(S_i), which is
the induction statement for u_i. (Recall that there are no ties in valuations.)


Remark 14.9 Lemma 14.8 holds for every subinstance of the AS graph,
because both the Gao-Rexford constraints and policy consistency hold for every
subinstance.


Remark 14.10 No dispute wheel implies a unique collusion-proof ex-post Nash 
solution to which BGP converges. Hence, we are not concerned with the standard 
problem that arises when multiple equilibria exist, namely whether nodes select 
the same equilibrium. 


14.4 Conclusion and Open Problems 


In this chapter, we have reviewed the work that has been done on distributed algorithmic
mechanism design, in which the presence of strategic computational agents introduces
new incentive and computational challenges for distributed computing. In particular,
we have presented in detail some of the known results about DAMD for interdomain
routing, which is the best motivated and most extensively studied problem in the area.
There are at least two interesting directions for further research.

First, there is the general question of which other problems in networked computation
are amenable to the approaches explored in this chapter. Several good candidates
have been proposed, e.g., web caching, peer-to-peer file sharing, overlay-network
construction, and distributed task allocation. Although both distributed algorithms and
incentive compatibility have been considered in the literature about these problems,
the results have not been pulled together into a coherent DAMD theory. The construction
of such a theory remains a worthy goal.

Second, there are many questions about interdomain routing that have not been fully 
answered. There is still no complete characterization of the conditions under which BGP 
converges robustly. (“No dispute wheel” is sufficient but not known to be necessary.) 
Similarly, the conditions under which collusion-proof ex-post Nash equilibrium is 
reached simply by executing BGP have not been characterized completely. (Again, the 
Gao-Rexford and policy-consistency conditions presented in this chapter are sufficient 
but not known to be necessary.) In fact, necessary and sufficient conditions on AS 
graphs and routing policies have not yet been obtained for ex-post Nash equilibrium, 
even if we ignore collusion and allow payments. Both policy consistency and local 
optimality play an essential role in the main result presented in this chapter, and little is 
known about what can be obtained without them. In general, the network complexity 
of BGP is open, even in cases when convergence is assured. 


14.5 Notes 


Given the distributed and autonomous nature of Internet users, it is no surprise that the
networking and distributed-systems literature provides some of the earliest applications
of game theory and mechanism design to computer-science problems. These themes
were first explored in an early series of papers from Columbia University, e.g., Ferguson
(1989), Hsiao and Lazar (1988), Kurose et al. (1985), Kurose and Simha (1989),
Mazumdar and Douligeris (1992), and Yemini (1981), which were followed by
contributions from Miller and Drexler (1988a, 1988b), Sanders (1986, 1988a, 1988b), and
others (Kelly, 1997; Kelly et al., 1998; La and Anantharam, 1997; Murphy and Murphy,
1994; Mackie-Mason and Varian, 1995; Shenker, 1990, 1995). Because networking
problems are inherently distributed, and network protocols must have reasonable
network complexity, these papers were actually early forerunners of DAMD.

Nisan and Ronen were the first to combine algorithmic and economic concerns in
a new area of study for which they coined the term “algorithmic mechanism design,”
and this book is largely an outgrowth of their seminal paper (Nisan and Ronen, 2001).
The extension of AMD to DAMD was first explored in Feigenbaum et al. (2001), 
which considered the multicast cost-sharing problem described in Section 14.2 and 
articulated the notion of network complexity; the DAMD agenda was more broadly 
described soon thereafter in Feigenbaum and Shenker (2002). Subsequent work on 
DAMD for multicast cost sharing can be found in, e.g., Archer et al. (2004), Adler and 


that guarantee optimality, both global and local; a result that is similar to (but weaker
than) Lemma 14.8 is presented in Sobrinho (2005). 

For basic background on Internet routing, see Kurose and Ross (2005), Peterson and
Davie (2003), or other networking textbooks.


Exercises


14.1. Recall from Chapter 9 that, in a second-price Vickrey auction of a single item, 
the item is sold to the highest bidder, and the price that the winner pays is the 
second-highest bid. Consider a network in which there is one bidder at each node, 
and the nodes lie on a cycle. As in Section 14.2, we assume that there is no 
trusted center to implement an algorithm but that there is a central enforcer that 
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can implement the outcome decided upon by the agents and can impose severe 
penalties if the agents do not agree on an outcome. Give a distributed algorithm 
for computing the winner and the price in a second-price Vickrey auction on such 
a network that has the following properties: (i) it is incentive-compatible in ex-post 
Nash equilibrium; (ii) it requires no more than two messages to cross each link; 
and (iii) each message is at most O(log m + log n) bits long, where m is the highest 
bid, and n is the number of bidders. Prove that your algorithm satisfies these three 
properties. 

14.2. Prove that, in the MC multicast cost-sharing mechanism, there is a single “largest” 
receiver set that maximizes NW. 


14.3. Prove the correctness of the algorithm given in Section 14.2.2 for computation of 
MC cost shares. 


14.4. A strategyproof mechanism is group strategyproof (GSP) if no coalition of deviating 
agents can achieve an outcome that is at least as good for all deviating agents and 
strictly better for at least one. For each of the MC and SH multicast cost-sharing 
mechanisms, either prove that it is GSP or provide a counterexample. 


14.5. Consider a single-item, ascending-price auction with “jump bids.” Type $\theta_i$ denotes 
agent $i$'s value for the item. Bids are associated with a “bid price.” In round $t$, the 
auctioneer announces an “ask price” $p^t$ that is $\epsilon > 0$ above the highest bid received 
so far. Any agent can bid in round $t$, as long as the bid is at some price at or above 
$p^t$. The provisional winner is the agent with the current highest bid (breaking ties at 
random). The auction terminates when no agent bids at the current ask price, and 
the item is then sold to the provisional winner at its final bid price. The information 
state $(p^t, x^t)$ defines the current ask price $p^t$ and provisional winner $x^t \in \{1, \ldots, n\}$. 
The following is a straightforward bidding strategy that determines what agent $i$ 
will do in state $(p, x)$: if $p \le \theta_i$ and $x \ne i$, then bid $p$; otherwise, do not bid. Prove 
that this strategy profile is an ex-post Nash equilibrium but not a dominant-strategy 
equilibrium. 

14.6. Prove that policy consistency is satisfied if all ASes use next-hop policies, or if all 
use metric-based policies. 


14.7. Give an interdomain-routing instance (i.e., an AS graph in which one AS is identified 
as the destination, each edge is identified as a peer edge or a customer-provider 
edge, and a valuation function is given for each source AS) that does not contain a 
dispute wheel but also does not satisfy the Gao-Rexford constraints. Explain why 
the Gao-Rexford constraints are not satisfied by this instance. 


14.8. Prove that, in the interdomain-routing problem, it is NP-hard to find a route 
allocation that comes within a constant factor of the maximum social welfare if no 
restrictions are made on the valuation functions. 


CHAPTER 15 


Cost Sharing 


Kamal Jain and Mohammad Mahdian 


Abstract 


The objective of cooperative game theory is to study ways to enforce and sustain cooperation among 
agents willing to cooperate. A central question in this field is how the benefits (or costs) of a joint 
effort can be divided among participants, taking into account individual and group incentives, as well 
as various fairness properties. 

In this chapter, we define basic concepts and review some of the classical results in the cooperative 
game theory literature. Our focus is on games that are based on combinatorial optimization problems 
such as facility location. We define the notion of cost sharing, and explore various incentive and 
fairness properties cost-sharing methods are often expected to satisfy. We show how cost-sharing 
methods satisfying a certain property termed cross-monotonicity can be used to design mechanisms 
that are robust against collusion, and study the algorithmic question of designing cross-monotonic 
cost-sharing schemes for combinatorial optimization games. Interestingly, this problem is closely 
related to linear-programming-based techniques developed in the field of approximation algorithms. 
We explore this connection, and explain a general method for designing cross-monotonic cost-sharing 
schemes, as well as a technique for proving impossibility bounds on such schemes. We will also 
discuss an axiomatic approach to characterize two widely applicable solution concepts: the Shapley 
value for cooperative games, and the Nash bargaining solution for a more restricted framework for 
surplus sharing. 


15.1 Cooperative Games and Cost Sharing 


Consider a setting where a set $A$ of $n$ agents seek to cooperate in order to generate value. 
The value generated depends on the coalition $S$ of agents cooperating. In general, the 
set of possible outcomes of cooperation among agents in $S \subseteq A$ is denoted by $V(S)$, 
where each outcome is given by a vector in $\mathbb{R}^S$, whose $i$th component specifies the 
utility that the agent $i \in S$ derives in this outcome. The set $A$ of agents along with the 
function $V$ defines what is called a cooperative game (also known as a coalitional game) 
with nontransferable utilities (abbreviated as an NTU game). A special case, called a 
cooperative game with transferable utilities (abbreviated as a TU game), is when the 
value generated by a coalition can be divided in an arbitrary way among the agents in $S$. 
In other words, a TU game is defined by specifying a function $v : 2^A \to \mathbb{R}$, which gives 
the value $v(S) \in \mathbb{R}$ generated by each coalition $S$. We assume $v(\emptyset) = 0$. The set of all 
possible outcomes in such a game is defined as $V(S) = \{x \in \mathbb{R}^S : \sum_{i \in S} x_i = v(S)\}$. 

Figure 15.1. An example of the facility location game. 

The notion of a cooperative game was first proposed by von Neumann and 
Morgenstern. This notion seeks to abstract away all other aspects of the game ex- 
cept the combinatorial aspect of the coalitions that can form. This is in contrast with 
noncooperative games, where the focus is on the set of choices (moves) available to 
each agent. 

Note that in the definition of a cooperative game, we did not restrict the values to 
be nonnegative.¹ In fact, the case that all values are nonpositive is the focus of this 
chapter, as it corresponds to the problem of sharing the cost of a service among those 
who receive the service (this is by taking the value to be the negative of the cost). Again, 
the cost-sharing problem can be studied in both the TU and the NTU models. The TU 
model applies to settings where, for example, a service provider incurs some (monetary) 
cost c(S) in building a network that connects a set S of customers to the Internet, and 
needs to divide this cost among customers in S. In practice, the cost function c is often 
defined by solving a combinatorial optimization problem. One example, which we will 
use throughout the chapter, is the facility location game defined below. 


Definition 15.1 In the facility location game, we are given a set $A$ of agents 
(also known as cities, clients, or demand points), a set $F$ of facilities, a facility 
opening cost $f_i$ for every facility $i \in F$, and a distance $d_{ij}$ between every pair 
$(i, j)$ of points in $A \cup F$ indicating the cost of connecting $j$ to $i$. We assume 
that the distances come from a metric space; i.e., they are symmetric and obey 
the triangle inequality. For a set $S \subseteq A$ of agents, the cost of this set is defined 
as the minimum cost of opening a set of facilities and connecting every agent 
in $S$ to an open facility. More precisely, the cost function $c$ is defined by 

$$c(S) = \min_{F' \subseteq F} \Big\{ \sum_{i \in F'} f_i + \sum_{j \in S} \min_{i \in F'} d_{ij} \Big\}.$$ 


1 If all values are nonnegative, the problem is called a surplus sharing problem. 

Example 15.2 Figure 15.1 shows an instance of the facility location game with 
3 agents $\{a, b, c\}$ and 2 facilities $\{1, 2\}$. The distances between some pairs are 
marked in the figure, and other distances can be calculated using the triangle 
inequality (e.g., the distance between facility 1 and client $c$ is $2 + 1 + 1 = 4$). 
The cost function defined by this instance is the following: 

$c(\{a\}) = 4$, $c(\{b\}) = 3$, $c(\{c\}) = 3$, 
$c(\{a, b\}) = 6$, $c(\{b, c\}) = 4$, $c(\{a, c\}) = 7$, $c(\{a, b, c\}) = 8$. 
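
The cost function of Definition 15.1 can be evaluated by brute force on small instances. The following Python sketch does this for an instance that reproduces the seven costs listed in Example 15.2. Because Figure 15.1 is not reproduced here, the opening costs and edge lengths in `OPENING` and `EDGES` are assumed values chosen to match those costs, not data taken from the figure; the function names are likewise illustrative.

```python
from itertools import combinations

# Assumed instance: opening costs and marked edge lengths are hypothetical
# values chosen so that the resulting costs match Example 15.2.
OPENING = {1: 2, 2: 2}
EDGES = {(1, "a"): 2, (1, "b"): 2, (2, "b"): 1, (2, "c"): 1}


def metric_closure(edges):
    """All-pairs shortest paths (Floyd-Warshall) over the marked edges."""
    nodes = {u for e in edges for u in e}
    dist = {(u, v): 0 if u == v else float("inf") for u in nodes for v in nodes}
    for (u, v), w in edges.items():
        dist[u, v] = dist[v, u] = w
    for k in nodes:
        for u in nodes:
            for v in nodes:
                dist[u, v] = min(dist[u, v], dist[u, k] + dist[k, v])
    return dist


def facility_cost(agents, opening, dist):
    """c(S): cheapest nonempty set of open facilities plus connection costs."""
    if not agents:
        return 0
    facilities = list(opening)
    best = float("inf")
    for r in range(1, len(facilities) + 1):
        for open_set in combinations(facilities, r):
            cost = sum(opening[i] for i in open_set)
            cost += sum(min(dist[i, j] for i in open_set) for j in agents)
            best = min(best, cost)
    return best


DIST = metric_closure(EDGES)
for S in ["a", "b", "c", "ab", "bc", "ac", "abc"]:
    print(S, facility_cost(S, OPENING, DIST))
```

The enumeration is exponential in the number of facilities, so it is only meant to make the definition concrete, not to serve as a practical algorithm.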


Since the monetary cost may be distributed arbitrarily among the agents, it is natural 
to model the above example as a TU game. An example of a case where the NTU 
model is more applicable is a network design problem where the cost incurred by an 
agent i in the set of agents S receiving the service corresponds to the delay this agent 
suffers. There are multiple designs for the network connecting the customers in S, and 
each design corresponds to a profile of delays that these agents will suffer. The set of 
possible outcomes for the coalition S is defined as the collection of all such profiles, 
and is denoted by C(S). As delays are nontransferable, this setting is best modeled 
as an NTU cost-sharing game. For another example of an NTU game, see the housing 
allocation problem in Section 10.3 of this book. 

As most of the work on cost sharing in the algorithmic game theory literature has 
so far focused on TU games, this chapter is mainly devoted to such games; henceforth, 
by a cost-sharing game we mean a TU game, unless otherwise stated. 


15.2 Core of Cost-Sharing Games 


A central notion in cooperative game theory is the notion of core. Roughly speaking, 
the core of a cooperative game is an outcome of cooperation among all agents where no 
coalition of agents can all benefit by breaking away from the grand coalition. Intuitively, 
the core of a game corresponds to situations where it is possible to sustain cooperation 
among all agents in an economically stable manner. 

In this section, we define the notion of core for cost-sharing games, and present 
two classical results on conditions for nonemptiness of the core. We show how the 
notion of core for TU games can be relaxed to an approximate version suitable for hard 
combinatorial optimization games, and observe a connection between this notion and 
the integrality gap of a linear programming relaxation of such problems. 


15.2.1 Core of TU Games 


Formally, the core of a TU cost-sharing game is defined as follows. 


Definition 15.3 Let $(A, c)$ be a TU cost-sharing game. A vector $\alpha \in \mathbb{R}^A$ (sometimes 
called a cost allocation) is in the core of this game if it satisfies the following 
two conditions: 

* Budget balance: $\sum_{j \in A} \alpha_j = c(A)$. 

* Core property: for every $S \subseteq A$, $\sum_{j \in S} \alpha_j \le c(S)$. 

Example 15.4 As an example, consider the facility location game of Example 15.2 
(Figure 15.1). It is not hard to verify that the vector $(4, 2, 2)$ lies in the 
core of this game. In fact, this is not the only cost allocation in the core of this 
game; for example, $(4, 1, 3)$ is also in the core. On the other hand, if a third facility 
with opening cost 3 and distance 1 to agents $a$ and $c$ is added to this game, the 
resulting game will have an empty core. To see this, note that after adding the 
third facility, we have $c(\{a, c\}) = 5$. Now, if there is a vector $\alpha$ in the core of this 
game, we must have 

$$\alpha_a + \alpha_b \le c(\{a, b\}) = 6$$ 
$$\alpha_b + \alpha_c \le c(\{b, c\}) = 4$$ 
$$\alpha_a + \alpha_c \le c(\{a, c\}) = 5$$ 

By adding the above three inequalities and dividing both sides by 2, we obtain 
$\alpha_a + \alpha_b + \alpha_c \le 7.5 < c(\{a, b, c\})$. Therefore $\alpha$ cannot be budget balanced. 
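
The two conditions of Definition 15.3 are easy to check mechanically once the cost function is tabulated. The sketch below, a minimal illustration rather than part of the original text, verifies the two allocations mentioned in Example 15.4 and recomputes the bound of 7.5 for the modified game. The helper name `in_core` is hypothetical, and the modified table assumes that only $c(\{a, c\})$ changes while the remaining values, including $c(\{a, b, c\}) = 8$, are carried over from Example 15.2.

```python
from itertools import combinations

AGENTS = ("a", "b", "c")

# Cost table of Example 15.2, keyed by frozenset of agents.
COST = {frozenset(k): v for k, v in {
    "a": 4, "b": 3, "c": 3, "ab": 6, "bc": 4, "ac": 7, "abc": 8}.items()}


def in_core(alloc, cost, agents=AGENTS, tol=1e-9):
    """Check budget balance and the core property of Definition 15.3."""
    if abs(sum(alloc.values()) - cost[frozenset(agents)]) > tol:
        return False
    for r in range(1, len(agents) + 1):
        for S in combinations(agents, r):
            if sum(alloc[j] for j in S) > cost[frozenset(S)] + tol:
                return False
    return True


print(in_core({"a": 4, "b": 2, "c": 2}, COST))
print(in_core({"a": 4, "b": 1, "c": 3}, COST))

# Modified game: assume only c({a, c}) drops to 5, as stated in the example.
COST3 = dict(COST)
COST3[frozenset("ac")] = 5
pair_bound = sum(COST3[frozenset(p)] for p in ("ab", "bc", "ac")) / 2
print(pair_bound, COST3[frozenset("abc")])
```

Any allocation satisfying the three pairwise constraints sums to at most `pair_bound`, which is below the grand-coalition cost, so no budget-balanced allocation can pass `in_core` on the modified table.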


A classical result in cooperative game theory, known as the Bondareva-Shapley 
theorem, gives a necessary and sufficient condition for a game to have nonempty core. 
To state this theorem, we need the following definition. 


Definition 15.5 A vector $\lambda$ that assigns a nonnegative weight $\lambda_S$ to each 
subset $S \subseteq A$ is called a balanced collection of weights if for every $j \in A$, 

$$\sum_{S : j \in S} \lambda_S = 1.$$ 


Theorem 15.6 A cost-sharing game $(A, c)$ with transferable utilities has a 
nonempty core if and only if for every balanced collection of weights $\lambda$, we have 

$$\sum_{S \subseteq A} \lambda_S c(S) \ge c(A).$$ 


PROOF By the definition of the core, the game $(A, c)$ has a nonempty core if 
and only if the solution of the following linear program (LP) is precisely $c(A)$. 
(Note that this solution can never be larger than $c(A)$.) 

$$
\begin{aligned}
\text{Maximize} \quad & \sum_{j \in A} \alpha_j \\
\text{Subject to} \quad & \forall S \subseteq A: \ \sum_{j \in S} \alpha_j \le c(S).
\end{aligned}
\tag{15.1}
$$

By strong LP duality, the solution of the above LP is equal to the solution of the 
following dual program: 

$$
\begin{aligned}
\text{Minimize} \quad & \sum_{S \subseteq A} \lambda_S c(S) \\
\text{Subject to} \quad & \forall j \in A: \ \sum_{S : j \in S} \lambda_S = 1 \\
& \forall S \subseteq A: \ \lambda_S \ge 0.
\end{aligned}
\tag{15.2}
$$

Therefore, the core of the game is nonempty if and only if the solution of the 
LP (15.2) is equal to $c(A)$. By definition, feasible solutions of this program 
are balanced collections of weights. Therefore, the core of the game $(A, c)$ is 
nonempty if and only if for every balanced collection of weights $(\lambda_S)$, 
$\sum_{S \subseteq A} \lambda_S c(S) \ge c(A)$. □ 
As an example, the proof of emptiness of the core given in Example 15.4 can be 
restated by defining a vector $\lambda$ as follows: $\lambda_{\{a,b\}} = \lambda_{\{b,c\}} = \lambda_{\{a,c\}} = \frac{1}{2}$ and $\lambda_S = 0$ for 
every other set $S$. It is easy to verify that $\lambda$ is a balanced collection of weights and 

$$\sum_{S \subseteq A} \lambda_S c(S) < c(A).$$ 
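
The verification mentioned above amounts to two short computations, sketched here in Python with exact rational arithmetic. The helper names `is_balanced` and `weighted_cost` are illustrative, and the cost table again assumes that $c(\{a, b, c\}) = 8$ carries over to the modified game of Example 15.4.

```python
from fractions import Fraction

AGENTS = ("a", "b", "c")

# Costs of the modified game in Example 15.4 (c({a, c}) = 5); the value of
# c({a, b, c}) is assumed unchanged from Example 15.2.
COST = {frozenset(k): v for k, v in {
    "a": 4, "b": 3, "c": 3, "ab": 6, "bc": 4, "ac": 5, "abc": 8}.items()}

half = Fraction(1, 2)
LAMBDA = {frozenset("ab"): half, frozenset("bc"): half, frozenset("ac"): half}


def is_balanced(weights, agents):
    """Definition 15.5: nonnegative weights summing to 1 around every agent."""
    if any(w < 0 for w in weights.values()):
        return False
    return all(sum(w for S, w in weights.items() if j in S) == 1 for j in agents)


def weighted_cost(weights, cost):
    """Objective of LP (15.2) for the given weights."""
    return sum(w * cost[S] for S, w in weights.items())


print(is_balanced(LAMBDA, AGENTS))
print(weighted_cost(LAMBDA, COST), COST[frozenset(AGENTS)])
```

Each agent lies in exactly two of the three pairs, so the weights around it sum to 1, and the weighted cost is $(6 + 4 + 5)/2 = 7.5$.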


15.2.2 Approximate Core 


As we saw in Example 15.4, a difficulty with the notion of core is that the core of 
many cost-sharing games, including most combinatorial optimization games based on 
computationally hard problems, is often empty. Furthermore, when the underlying cost 
function is hard to compute (e.g., in the facility location game), even deciding whether 
the core of the game is empty is often computationally intractable. This motivates the 
following definition. 


Definition 15.7 A vector $\alpha \in \mathbb{R}^A$ is in the $\gamma$-approximate core (or $\gamma$-core, for 
short) of the game $(A, c)$ if it satisfies the following two conditions: 

* $\gamma$-Budget balance: $\gamma c(A) \le \sum_{j \in A} \alpha_j \le c(A)$. 

* Core property: for every $S \subseteq A$, $\sum_{j \in S} \alpha_j \le c(S)$. 


For example, in the facility location game given in Example 15.4, the vector 
$(3.5, 2.5, 1.5)$ is in the $\frac{15}{16}$-core of the game. Note that the argument given to show 
the emptiness of the core of this game actually proves that for every $\gamma > \frac{15}{16}$ the 
$\gamma$-core of this game is empty. 
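
As a quick check of this example, the following sketch computes the best $\gamma$ that a given allocation certifies under Definition 15.7. It is an illustration only: `certified_gamma` is a hypothetical helper, and the cost table assumes $c(\{a, b, c\}) = 8$ in the modified game of Example 15.4.

```python
from fractions import Fraction
from itertools import combinations

AGENTS = ("a", "b", "c")

# Modified game of Example 15.4; c({a, b, c}) = 8 is an assumption carried
# over from Example 15.2.
COST = {frozenset(k): v for k, v in {
    "a": 4, "b": 3, "c": 3, "ab": 6, "bc": 4, "ac": 5, "abc": 8}.items()}


def certified_gamma(alloc, cost, agents=AGENTS):
    """Largest gamma for which alloc is in the gamma-core, or None if the
    core property (or the upper budget bound) fails."""
    for r in range(1, len(agents) + 1):
        for S in combinations(agents, r):
            if sum(alloc[j] for j in S) > cost[frozenset(S)]:
                return None
    return sum(alloc.values()) / cost[frozenset(agents)]


alloc = {"a": Fraction(7, 2), "b": Fraction(5, 2), "c": Fraction(3, 2)}
print(certified_gamma(alloc, COST))
```

The allocation collects 7.5 in total against a cost of 8, which gives the ratio 15/16.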

The Bondareva-Shapley theorem can be easily generalized to the following 
approximate version. 


Theorem 15.8 For every $\gamma \le 1$, a cost-sharing game $(A, c)$ with transferable 
utilities has a nonempty $\gamma$-core if and only if for every balanced collection of 
weights $\lambda$, we have $\sum_{S \subseteq A} \lambda_S c(S) \ge \gamma c(A)$. 


The proof is similar to the proof of the Bondareva-Shapley theorem and is based 
on the observation that by LP duality, the $\gamma$-core of the game is nonempty if and only 
if the solution of the LP (15.2) is at least $\gamma c(A)$. Note that if the cost function $c$ is 
subadditive (i.e., $c(S_1 \cup S_2) \le c(S_1) + c(S_2)$ for any two disjoint sets $S_1$ and $S_2$), then 
the optimal integral solution of the LP (15.2) is precisely $c(A)$. Therefore, 


Corollary 15.9 For any cost-sharing game $(A, c)$ with a subadditive cost function, 
the largest value $\gamma$ for which the $\gamma$-core of this game is nonempty is equal 
to the integrality gap of the LP (15.2). 


As it turns out, for many combinatorial optimization games such as set cover, vertex 
cover, and facility location, the LP formulation (15.2) is in fact equivalent to the 
standard (polynomial-size) LP formulation of the problem, and hence Corollary 15.9 
translates into a statement about the integrality gap of the standard LP formulation of 
the problem. Here, we show this connection for the facility location game. 
We start by formulating the facility location problem as an integer program. In this 
formulation, $x_i$ and $y_{ij}$ are variables indicating whether facility $i$ is open, and whether 
agent $j$ is connected to facility $i$. 

$$
\begin{aligned}
\text{Minimize} \quad & \sum_{i \in F} f_i x_i + \sum_{i \in F} \sum_{j \in A} d_{ij} y_{ij} \\
\text{Subject to} \quad & \forall j \in A: \ \sum_{i \in F} y_{ij} \ge 1 \\
& \forall i \in F, j \in A: \ x_i \ge y_{ij} \\
& \forall i \in F, j \in A: \ x_i, y_{ij} \in \{0, 1\}.
\end{aligned}
\tag{15.3}
$$

By relaxing the integrality constraint to $x_i, y_{ij} \ge 0$ we obtain an LP whose dual can be 
written as follows: 

$$
\begin{aligned}
\text{Maximize} \quad & \sum_{j \in A} \alpha_j \\
\text{Subject to} \quad & \forall i \in F, j \in A: \ \beta_{ij} \ge \alpha_j - d_{ij} \\
& \forall i \in F: \ \sum_{j \in A} \beta_{ij} \le f_i \\
& \forall i \in F, j \in A: \ \alpha_j, \beta_{ij} \ge 0.
\end{aligned}
\tag{15.4}
$$

Note that we may assume without loss of generality that in a feasible solution of the 
above LP, $\beta_{ij} = \max(0, \alpha_j - d_{ij})$. Thus, to specify a dual solution it is enough to give 
$\alpha$. We now observe that the above dual LP captures the core constraint of the facility 
location game; i.e., it is equivalent to the LP (15.1). 
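
Since $\beta$ is determined by $\alpha$, checking feasibility for the dual LP (15.4) reduces to one inequality per facility. The Python sketch below illustrates this on a small instance. The opening costs and distances are assumed values, chosen to be consistent with the costs of Example 15.2 because Figure 15.1 is not reproduced here, and `dual_feasible` is a hypothetical helper name.

```python
# Hypothetical instance consistent with the costs in Example 15.2
# (the figure is not available, so these numbers are assumptions).
OPENING = {1: 2, 2: 2}                           # opening costs f_i
DIST = {(1, "a"): 2, (1, "b"): 2, (1, "c"): 4,
        (2, "a"): 5, (2, "b"): 1, (2, "c"): 1}   # distances d_ij
AGENTS = ("a", "b", "c")


def dual_feasible(alpha):
    """True if alpha is feasible for LP (15.4) with the canonical choice
    beta_ij = max(0, alpha_j - d_ij)."""
    if any(alpha[j] < 0 for j in AGENTS):
        return False
    for i, f_i in OPENING.items():
        beta_sum = sum(max(0, alpha[j] - DIST[i, j]) for j in AGENTS)
        if beta_sum > f_i:
            return False
    return True


for alpha in ({"a": 4, "b": 2, "c": 2}, {"a": 5, "b": 2, "c": 2}):
    print(alpha, dual_feasible(alpha), sum(alpha.values()))
```

On this assumed instance the allocation $(4, 2, 2)$ is dual feasible with objective value 8, while raising agent $a$'s share to 5 violates the constraint of facility 1.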


Proposition 15.10 For any feasible solution $(\alpha, \beta)$ of the LP (15.4), $\alpha$ satisfies 
the core property of the facility location game. 

PROOF We need to show that for every set $S \subseteq A$, $\sum_{j \in S} \alpha_j \le c(S)$, where $c(S)$ 
is the cost of the facility location problem for agents in $S$. First we note that for any 
facility $i$ and set of agents $R \subseteq A$, by adding the first and the second inequalities 
of the LP (15.4) for facility $i$ and every $j \in R$ we obtain 

$$\sum_{j \in R} \alpha_j \le f_i + \sum_{j \in R} d_{ij}. \tag{15.5}$$ 

Now, consider an optimal solution for the set of agents $S$, and assume $i_1, \ldots, i_k$ 
are facilities that are open and $R_\ell$ is the set of agents served by facility $i_\ell$ in this 
solution. Summing Inequality (15.5) for every $(i_\ell, R_\ell)$ will yield the result. 

By the above proposition, the solution of the dual LP (15.4) (which, by LP duality, 
is the same as the LP relaxation of (15.3)) is equal to the solution of the LPs (15.1) 
and (15.2). Furthermore, the optimal integral solution of (15.3) is $c(A)$. Therefore, the 
integrality gap of (15.3) is the same as that of (15.2) and gives the best budget balance 
factor that a cost allocation satisfying the core property can achieve. The best known 
results in the field of approximation algorithms show that this gap (in the worst case) 
is between $\frac{1}{1.52}$ and $\frac{1}{1.463}$. 
15.2.3 Core of NTU Games 


We conclude this section with a classical theorem due to Scarf, which gives a sufficient 
condition for the nonemptiness of the core of NTU games similar to the one given in 
Theorem 15.6 for TU games. However, unlike in the case of TU games, the condition 
given in Scarf’s theorem is not necessary for the nonemptiness of the core. 

Formally, the core of an NTU cost-sharing game $(A, C)$ is the collection of all 
cost allocations $\alpha \in C(A)$ such that there is no nonempty coalition $S \subseteq A$ and cost 
allocation $x \in C(S)$ for which $x_j < \alpha_j$ for all $j \in S$. Note that this definition coincides 
with Definition 15.3 in the case of a TU game. In the following theorem the support of 
a balanced collection of weights $\lambda$ denotes the collection of all sets $S$ with $\lambda_S > 0$. 


Theorem 15.11 Let $(A, C)$ be a cost-sharing game with nontransferable utilities. 
Assume for every balanced collection of weights $\lambda$ and every vector $x \in \mathbb{R}^A$ 
the following property holds: if for every set $S$ in the support of $\lambda$, the restriction of 
$x$ to the coordinates in $S$ is in $C(S)$, then $x \in C(A)$. Then $(A, C)$ has a nonempty 
core. 


The proof of the above theorem, which is beyond the scope of this chapter, uses an 
adaptation of the Lemke-Howson algorithm for computing Nash equilibria (described 
in Section 3.4 of this book), and is considered an early and important contribution of 
the algorithmic viewpoint in game theory. However, the worst case running time of 
this algorithm (like the Lemke-Howson algorithm) is exponential in $|A|$. This is in 
contrast to the proof of the Bondareva-Shapley theorem, which gives a polynomial-time 
algorithm² for computing a point in the core of the game, if the core is nonempty. 


15.3 Group-Strategyproof Mechanisms and Cross-Monotonic 
Cost-Sharing Schemes 


The cost-sharing problem defined in this chapter models the pricing problem for a 
service provider with a given set of customers. In settings where the demand is sensitive 
to the price, an alternative choice for the service provider is to conduct an auction 
between the potential customers to select the set of customers who can receive the 
service based on their willingness to pay and the cost structure of the problem. The 
goal is to design an auction mechanism that provides incentives for individuals as well 
as groups of agents to bid truthfully. In this section, we study this problem and exhibit 
its connection to cost sharing. 

2 This is assuming a suitable representation for the cost function c, e.g., by a separation oracle for (15.1), or in 
combinatorial optimization games where statements like Proposition 15.10 hold. 

We start with the definition of the setting. Let $A$ be a set of $n$ agents interested in 
receiving a service. The cost of providing service is given by a cost function $c : 2^A \to 
\mathbb{R}^+ \cup \{0\}$, where $c(S)$ specifies the cost of providing service for agents in $S$. Each agent 
$i$ has a value $u_i \in \mathbb{R}$ for receiving the service; that is, she is willing to pay at most $u_i$ 
to get the service. We further assume that the utility of agent $i$ is given by $u_i q_i - x_i$, 
where $q_i$ is an indicator variable that indicates whether she has received the service or 
not, and $x_i$ is the amount she has to pay. A cost-sharing mechanism is an algorithm that 
elicits a bid $b_i \in \mathbb{R}$ from each agent, and based on these bids, decides which agents 
should receive the service and how much each of them has to pay. More formally, 
a cost-sharing mechanism is a function that associates to each vector $b$ of bids a set 
$Q(b) \subseteq A$ of agents to be serviced, and a vector $p(b) \in \mathbb{R}^n$ of payments. When there 
is no ambiguity, we write $Q$ and $p$ instead of $Q(b)$ and $p(b)$, respectively. We assume 
that a mechanism satisfies the following conditions: 


* No Positive Transfer (NPT): The payments are nonnegative (i.e., $p_i \ge 0$ for all $i$). 

* Voluntary Participation (VP): An agent who does not receive the service is not charged 
(i.e., $p_i = 0$ for $i \notin Q$), and an agent who receives the service is not charged more than 
her bid (i.e., $p_i \le b_i$ for $i \in Q$). 

* Consumer Sovereignty (CS): For each agent $i$, there is some bid $b_i^*$ such that if $i$ bids 
$b_i^*$, she will get the service, no matter what others bid. 


Furthermore, we want the mechanisms to be approximately budget-balanced. We 
call a mechanism $\gamma$-budget-balanced with respect to the cost function $c$ if the total 
amount the mechanism charges the agents is between $\gamma c(Q)$ and $c(Q)$ (i.e., 
$\gamma c(Q) \le \sum_{i \in Q} x_i \le c(Q)$). 

We look for mechanisms, called group strategyproof mechanisms, which satisfy the 
following property in addition to NPT, VP, and CS. Let $S \subseteq A$ be a coalition of agents, 
and $u, u'$ be two vectors of bids satisfying $u_i = u'_i$ for every $i \notin S$ (we think of $u$ as 
the values of agents, and $u'$ as a vector of strategically chosen bids). Let $(Q, p)$ and 
$(Q', p')$ denote the outputs of the mechanism when the bids are $u$ and $u'$, respectively, 
and let $q_i$ and $q'_i$ indicate whether agent $i$ belongs to $Q$ and $Q'$. 
A mechanism is group strategyproof if for every coalition $S$ of agents, if the inequality 
$u_i q'_i - p'_i \ge u_i q_i - p_i$ holds for every $i \in S$, then it holds with equality for every $i \in S$. 
In other words, there should not be any coalition $S$ and vector $u'$ of bids such that if 
members of $S$ announce $u'$ instead of $u$ (their true value) as their bids, then every 
member of the coalition $S$ is at least as happy as in the truthful scenario, and at least 
one person is happier. 

Moulin showed that cost-sharing methods satisfying an additional property termed 
cross-monotonicity can be used to design group-strategyproof cost-sharing mecha- 
nisms. The cross-monotonicity property captures the notion that agents should not be 
penalized as the serviced set grows. To define this property, we first need to define the 
notion of a cost-sharing scheme. 


Definition 15.12 Let $(A, c)$ denote a cost-sharing game. A cost-sharing scheme 
is a function that for each set $S \subseteq A$, assigns a cost allocation for $S$. More formally, 
a cost-sharing scheme is a function $\xi : A \times 2^A \to \mathbb{R}$ such that, for every $S \subseteq A$ 
and every $i \notin S$, $\xi(i, S) = 0$. We say that a cost-sharing scheme $\xi$ is $\gamma$-budget 
balanced if for every set $S \subseteq A$, we have $\gamma c(S) \le \sum_{i \in S} \xi(i, S) \le c(S)$. 


Definition 15.13 A cost-sharing scheme $\xi$ is cross-monotone if for all $S, T \subseteq A$ 
and $i \in S$, $\xi(i, S) \ge \xi(i, S \cup T)$. 
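
For small agent sets, cross-monotonicity can be tested exhaustively. The sketch below is an illustration under stated assumptions: it checks the condition only for single-agent extensions $T = \{j\}$, which suffices because larger sets $T$ can be added one agent at a time. Both example schemes (`equal_split` for a fixed total cost of 6, and `congestion_split`, where each served agent pays the size of the served set) are hypothetical and do not come from the text.

```python
from fractions import Fraction
from itertools import combinations


def subsets(agents):
    """Yield every subset of agents as a frozenset."""
    for r in range(len(agents) + 1):
        for S in combinations(agents, r):
            yield frozenset(S)


def is_cross_monotone(xi, agents):
    """Check xi(i, S) >= xi(i, S + {j}) for every S, i in S, and j not in S."""
    for S in subsets(agents):
        for j in agents:
            if j in S:
                continue
            T = S | {j}
            if any(xi(i, S) < xi(i, T) for i in S):
                return False
    return True


def equal_split(i, S, total=Fraction(6)):
    """Hypothetical scheme: a fixed cost shared equally by the served set."""
    return total / len(S) if i in S else Fraction(0)


def congestion_split(i, S):
    """Hypothetical scheme whose shares grow with the served set."""
    return len(S) if i in S else 0


AGENTS = ("a", "b", "c")
print(is_cross_monotone(equal_split, AGENTS))
print(is_cross_monotone(congestion_split, AGENTS))
```

The first scheme passes because a share of $6/|S|$ can only shrink as $S$ grows; the second fails as soon as a second agent joins.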
Mechanism $M_\xi$ 
Initialize $S \leftarrow A$. 
Repeat 
    Let $S \leftarrow \{i \in S : b_i \ge \xi(i, S)\}$. 
Until for all $i \in S$, $b_i \ge \xi(i, S)$. 
Return $Q = S$ and $p_i = \xi(i, S)$ for all $i$. 


Figure 15.2. Moulin’s group-strategyproof mechanism. 
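
The loop in Figure 15.2 translates almost line for line into code. The following Python sketch is one possible rendering, not the authors' implementation. The scheme `equal_split`, which divides a fixed cost of 6 equally among the served agents, is a hypothetical cross-monotonic example, and the bids are made up for illustration.

```python
from fractions import Fraction


def equal_split(i, S, total=Fraction(6)):
    """Hypothetical cross-monotonic scheme: split a fixed cost equally."""
    return total / len(S) if i in S else Fraction(0)


def moulin_mechanism(bids, xi):
    """Repeatedly drop agents whose bid is below their current cost share.

    bids maps each agent to her bid; xi(i, S) is the cost-sharing scheme.
    Returns the serviced set Q and the payment of every agent.
    """
    S = frozenset(bids)
    while True:
        keep = frozenset(i for i in S if bids[i] >= xi(i, S))
        if keep == S:
            break
        S = keep
    payments = {i: (xi(i, S) if i in S else 0) for i in bids}
    return S, payments


Q, p = moulin_mechanism({"a": 5, "b": 3, "c": 1}, equal_split)
print(sorted(Q), p)
```

With these bids, agent $c$ is dropped in the first round (her bid 1 is below the share 2), after which $a$ and $b$ each accept the share 3. Lowering $b$'s bid to 2 would unravel the whole set: $b$ drops out at share 3, and $a$ then faces the full cost of 6.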


The following proposition shows that cross-monotonicity is a stronger property than 
core. 


Proposition 15.14 Let $\xi$ be a $\gamma$-budget-balanced cross-monotonic cost-sharing 
scheme for the cost-sharing game $(A, c)$. Then $\xi(\cdot, A)$ is in the $\gamma$-core of this 
game. 


PROOF We need to verify only that $\xi(\cdot, A)$ satisfies the core property, i.e., for 
every set $S \subseteq A$, $\sum_{i \in S} \xi(i, A) \le c(S)$. By the cross-monotonicity property, for 
every $i \in S$, $\xi(i, A) \le \xi(i, S)$. Therefore, $\sum_{i \in S} \xi(i, A) \le \sum_{i \in S} \xi(i, S) \le c(S)$, 
where the last inequality follows from the $\gamma$-budget balance property of $\xi$. 


Given a cross-monotonic cost-sharing scheme $\xi$ for the cost-sharing game $(A, c)$, 
we define a cost-sharing mechanism $M_\xi$ as presented in Figure 15.2. 
The following proposition provides an alternative way to view the mechanism $M_\xi$. 


Proposition 15.15 Assume $\xi$ is a cross-monotonic cost-sharing scheme for 
the cost-sharing game $(A, c)$, and $b_i \in \mathbb{R}^+ \cup \{0\}$ for every $i \in A$. Then there 
is a unique maximal set $S \subseteq A$ satisfying the property that for every $i \in S$, 
$b_i \ge \xi(i, S)$. The mechanism $M_\xi$ returns this set. 


PROOF Assume that two different maximal sets $S_1$ and $S_2$ satisfy the stated 
property, i.e., $b_i \ge \xi(i, S_1)$ for every $i \in S_1$ and $b_i \ge \xi(i, S_2)$ for every $i \in S_2$. 
Then for every $i \in S_1$, $b_i \ge \xi(i, S_1) \ge \xi(i, S_1 \cup S_2)$, where the last inequality 
follows from cross-monotonicity of $\xi$. Similarly, for every $i \in S_2$, $b_i \ge \xi(i, S_1 \cup S_2)$. 
Therefore, the set $S_1 \cup S_2$ also satisfies this property. This contradicts 
the maximality of $S_1$ and $S_2$. 

Let $S^*$ denote the unique maximal set satisfying $b_i \ge \xi(i, S^*)$ for all $i \in S^*$. 
We claim that $M_\xi$ never eliminates any of the agents in $S^*$ from the serviced set 
$S$. Consider, for contradiction, the first step where it eliminates an agent $i \in S^*$ 
from the serviced set $S$. This means that we must have $b_i < \xi(i, S)$. However, 
since $S^* \subseteq S$, by cross-monotonicity we have $\xi(i, S) \le \xi(i, S^*)$, and hence 
$b_i < \xi(i, S^*)$, contradicting the definition of $S^*$. Therefore, the set $Q$ returned by $M_\xi$ 
contains $S^*$. By maximality of $S^*$, it cannot contain any other agent, that is, 
$Q = S^*$. 
We are now ready to prove the following theorem of Moulin. 


Theorem 15.16 If $\xi$ is a $\gamma$-budget-balanced cross-monotonic cost-sharing 
scheme, then $M_\xi$ is group-strategyproof and $\gamma$-budget balanced. 


PROOF Assume, for contradiction, that there is a coalition $T$ of agents that 
benefits from bidding according to the vector $u'$ instead of their true values $u$. 
Agents in $T$ can be partitioned into two sets $T^+$ and $T^-$ based on whether their 
bid in $u'$ is greater than their bid in $u$, or not. First, we claim that it can be 
assumed, without loss of generality, that $T^+$ is empty, i.e., agents cannot benefit 
from overbidding. To see this, we start from a bid vector where every agent in $T$ 
bids according to $u'$ (and others bid truthfully), and reduce the bids of the agents 
in $T^+$ to their true value one by one. If at any step, e.g., when the bid of agent 
$i \in T^+$ is reduced from $u'_i$ to $u_i$, the outcome of the auction changes, then by 
Proposition 15.15, $i$ must be a winner when she bids according to $u'$, and not a 
winner when she bids according to $u$. This means that $u'_i \ge \xi(i, S_1) > u_i$, where 
$S_1$ is the set of winners when $i$ bids according to $u'$. However, this means that 
the agent $i$ must pay an amount greater than her true value in the scenario where 
every agent in $T$ bids according to $u'$. This is in contradiction with the assumption 
that agents in $T$ all benefit from the collusion. By this argument, the bid of every 
agent in $T^+$ can be lowered to her true value without changing the outcome of 
the auction. Therefore, we assume without loss of generality that $T^+$ is empty. 
Now, let $S'$ and $S$ denote the set of winners in the untruthful and the truthful 
scenarios (i.e., when agents bid according to $u'$ and $u$), respectively. As the bid 
of each agent in $u'$ is less than or equal to her bid in $u$, by Proposition 15.15, 
$S' \subseteq S$. By cross-monotonicity, this implies that the payment of each agent in the 
untruthful scenario is at least as much as her payment in the truthful scenario. 
Therefore, no agent can be strictly happier in the untruthful scenario than in the 
truthful scenario. 


Moulin’s theorem shows that cross-monotonic cost-sharing schemes give rise to 
group-strategyproof mechanisms. An interesting question is whether the converse also 
holds, i.e., is there a way to construct a cross-monotonic cost-sharing scheme given 
a group-strategyproof mechanism? The answer to this question is negative (unless 
the cost function is assumed to be submodular), as there are examples where a cost 
function has a group-strategyproof mechanism but no cross-monotonic cost-sharing 
scheme. A partial characterization of the cost-sharing schemes that correspond to 
group-strategyproof mechanisms in terms of a property called semi-cross-monotonicity 
is known; however, finding a complete characterization of cost-sharing schemes arising 
from group-strategyproof mechanisms remains an open question. 


15.4 Cost Sharing via the Primal-Dual Schema 


In Section 15.2.2, we discussed how a cost allocation in the approximate core of a game 
can be computed by solving an LP, and noted that for many combinatorial optimization 
games, this LP is equivalent to the dual of the standard LP relaxation of the problem 
and the cost shares correspond to the dual variables. In this section, we explain how a 
technique called the primal-dual schema can be used to compute cost shares that not 
only are in the approximate core of the game, but also satisfy the cross-monotonicity 
property, and hence can be used in the mechanism described in the previous section. The 
primal-dual schema is a standard technique in the field of approximation algorithms, 
where the focus is on computing an approximately optimal primal solution, and the 
dual variables (cost shares) are merely a by-product of the algorithm. 

The idea of the primal-dual schema, which is often used to solve cost minimization 
problems, is to write the optimization problem as a mathematical program that can 
be relaxed into an LP (we refer to this LP as the primal LP). The dual of this LP 
gives a lower bound on the value of the optimal solution for the problem. Primal-dual 
algorithms simultaneously construct a solution to the primal problem and its dual. 
This is generally done by initially setting all dual variables to zero, and then gradually 
increasing these variables until some constraint in the dual program goes tight. This 
constraint hints at an object that can be paid for by the dual to be included in the primal 
solution. After this, the dual variables involved in the tight constraint are frozen, and 
the algorithm continues by increasing other dual variables. The algorithm ends when 
a complete solution for the primal problem is constructed. The analysis is based on 
proving that the values of the constructed primal and dual solutions are close to each 
other, and therefore they are both close to optimal. 

We will elaborate on two examples in this section: submodular games, where a simple 
primal-dual algorithm with no modification yields cross-monotonic cost-shares, and the 
facility location game, where extra care needs to be taken to obtain a cross-monotonic 
cost-sharing scheme. In the latter case, we introduce a rather general technique of 
using “ghost duals” to turn the standard primal-dual algorithm for the problem into an 
algorithm that returns a cross-monotonic cost-sharing scheme. 


15.4.1 Submodular Games 


Let us start with a definition of submodular games. 


Definition 15.17 A cost-sharing game (A, c) is called a submodular game if the 
cost function c satisfies 


$$\forall S, T \subseteq A: \quad c(S) + c(T) \ge c(S \cup T) + c(S \cap T).$$


The above condition is equivalent to the condition of decreasing marginal cost,
which says that for every two agents i and j and every set of agents $S \subseteq A \setminus \{i, j\}$,
the marginal cost of adding i to S (i.e., $c(S \cup \{i\}) - c(S)$) is no less than the
marginal cost of adding i to $S \cup \{j\}$ (i.e., $c(S \cup \{i, j\}) - c(S \cup \{j\})$). Recall that
we always assume $c(\emptyset) = 0$.
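
For small agent sets, both forms of the condition can be checked by brute force. The sketch below is only an illustration: the function names and the two toy cost functions (`capped_cost`, `squared_cost`) are assumptions made for this example and do not come from the text.

```python
from itertools import combinations

def subsets(agents):
    """Yield every subset of `agents` as a frozenset."""
    items = list(agents)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_submodular(agents, c):
    """Check c(S) + c(T) >= c(S | T) + c(S & T) for all S, T (Definition 15.17)."""
    all_sets = list(subsets(agents))
    return all(c(S) + c(T) >= c(S | T) + c(S & T)
               for S in all_sets for T in all_sets)

def has_decreasing_marginal_cost(agents, c):
    """Check that adding i to S costs no less than adding i to S + {j}."""
    everyone = frozenset(agents)
    for i in everyone:
        for j in everyone - {i}:
            for S in subsets(everyone - {i, j}):
                if c(S | {i}) - c(S) < c(S | {i, j}) - c(S | {j}):
                    return False
    return True

# Two hypothetical cost functions that depend only on |S|.
def capped_cost(S):      # 0, 2, 3, 3: concave in |S|
    return min(2 * len(S), 3)

def squared_cost(S):     # 0, 1, 4, 9: convex in |S|
    return len(S) ** 2

print(is_submodular("abc", capped_cost), has_decreasing_marginal_cost("abc", capped_cost))
print(is_submodular("abc", squared_cost), has_decreasing_marginal_cost("abc", squared_cost))
```

The two checks agree on both cost functions, as the equivalence stated above requires. The enumeration is exponential in the number of agents, so this is suitable only for toy instances.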


3 In many primal-dual algorithms, a postprocessing step is required to bring the cost of the primal solution down.
However, this step often does not change the cost shares.

Submodular games (also known as concave games) constitute an important class of
cost-sharing games with many interesting properties. One example in this class is the
multicast problem discussed in Section 14.2.2 of this book.

Consider a submodular game (A, c), and the LPs (15.2) and (15.1) as the primal and
the dual programs for this game, respectively. It is not hard to see that by submodularity
of c, the optimal value of the primal program is always c(A), giving a trivial optimal
solution for this LP. However, the dual LP (15.1) is nontrivial and its optimal solutions
correspond to cost allocations in the core of the game. Let $\alpha$ be a feasible solution of
this LP. We say that a set $S \subseteq A$ is tight if the corresponding inequality in the LP is
tight, i.e., if $\sum_{j \in S} \alpha_j = c(S)$. We need the following lemma to describe the algorithm.

Lemma 15.18 Let $\alpha$ be a feasible solution of the linear program (15.1). If two
sets $S_1, S_2 \subseteq A$ are tight, then so is $S_1 \cup S_2$.


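PROOF Since $\alpha$ is feasible, we have $\sum_{j \in S_1 \cap S_2} \alpha_j \le c(S_1 \cap S_2)$. This, together
with the submodularity of c and tightness of $S_1$ and $S_2$, implies

$$
\begin{aligned}
c(S_1 \cup S_2) &\le c(S_1) + c(S_2) - c(S_1 \cap S_2) \\
&\le \sum_{j \in S_1} \alpha_j + \sum_{j \in S_2} \alpha_j - \sum_{j \in S_1 \cap S_2} \alpha_j \\
&= \sum_{j \in S_1 \cup S_2} \alpha_j .
\end{aligned}
$$

Feasibility of $\alpha$ gives the reverse inequality $\sum_{j \in S_1 \cup S_2} \alpha_j \le c(S_1 \cup S_2)$.
Therefore, $S_1 \cup S_2$ is tight.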


Corollary 15.19 There is a unique maximal tight set. It is simply the union of 
all the tight sets. 


We are now ready to state the algorithm that computes the cost shares. This algorithm
is presented in Figure 15.3. Notice that by Lemma 15.18, when a new set goes tight,
the new maximal tight set contains the old one, and therefore once an element $j \in T$
is included in the frozen set F, it will stay in this set until the end of the algorithm.
Thus, the cost share $\alpha_j$ at the end of the algorithm is precisely the first time at which
the element j is frozen. Furthermore, note that the algorithm never allows $\alpha$ to become
an infeasible solution of the LP (15.1), and stops only when the set T goes tight.
Hence, the cost shares computed by the algorithm satisfy the budget balance and the
core property. All that remains is to show that they also satisfy the cross-monotonicity
property.


Theorem 15.20 The cost-sharing scheme defined by the algorithm
SubmodularCostShare (Figure 15.3) is cross-monotone.


PROOF Let $T_1 \subseteq T_2 \subseteq A$. We simultaneously run the algorithm for $T_1$ and $T_2$,
and call these two runs the $T_1$-run and the $T_2$-run. It is enough to show that at any
moment, the set of frozen elements in the $T_1$-run is a subset of that of the $T_2$-run.
Consider a time t (i.e., the moment when all unfrozen cost shares in both runs are
equal to t), and let $\alpha^l$ and $F_l$ denote the values of the variables and the frozen set
at this moment in the $T_l$-run, for l = 1, 2. We have

$$
\begin{aligned}
c(F_1 \cup F_2) &\le c(F_1) + c(F_2) - c(F_1 \cap F_2) \\
&\le \sum_{i \in F_1} \alpha^1_i + \sum_{i \in F_2} \alpha^2_i - \sum_{i \in F_1 \cap F_2} \alpha^1_i \\
&= \sum_{i \in F_1 \setminus F_2} \alpha^1_i + \sum_{i \in F_2} \alpha^2_i \\
&\le \sum_{i \in F_1 \cup F_2} \alpha^2_i ,
\end{aligned}
$$

where the first inequality follows from submodularity of c, the second follows
from the tightness of $F_l$ with respect to $\alpha^l$ (l = 1, 2) and the feasibility of $\alpha^1$,
and the last follows from the fact that for every $i \in F_1 \setminus F_2$, since $i \in T_1 \subseteq T_2$
and i is not frozen at time t in the $T_2$-run, we have $\alpha^2_i = t \ge \alpha^1_i$. The above
inequality implies that $F_1 \cup F_2$ is tight with respect to $\alpha^2$. Since by definition $F_2$
is the maximal tight set with respect to $\alpha^2$, we must have $F_1 \cup F_2 = F_2$. Hence,
$F_1 \subseteq F_2$, as desired.


Algorithm SubmodularCostShare

Input: submodular cost-sharing game (A, c);
       set $T \subseteq A$ of agents that receive the service
Output: cost shares $\alpha_j$ for every $j \in T$

For every j, initialize $\alpha_j$ to 0.
Let $F = \emptyset$.
While $T \setminus F \ne \emptyset$ do
    Increase all $\alpha_j$'s for $j \in T \setminus F$ at the same rate,
    until a new set goes tight.
    Let F be the maximal tight set.

Figure 15.3. Algorithm for computing cost shares in a submodular game.
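
The algorithm of Figure 15.3 can be simulated exactly on small instances by enumerating all subsets of T. The following is a minimal sketch under stated assumptions: the function names, the use of exact `Fraction` arithmetic, and the toy coverage cost (resources X and Y shared by agents a, b, c) are all introduced here for illustration and are not part of the text. Each iteration computes the largest uniform raise of the unfrozen shares that keeps every constraint of LP (15.1) restricted to T satisfied, then freezes the union of all tight sets, which is the maximal tight set by Corollary 15.19.

```python
from fractions import Fraction
from itertools import combinations

def nonempty_subsets(items):
    items = list(items)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def submodular_cost_share(T, c):
    """Simulate SubmodularCostShare for a small agent set T (exponential time)."""
    T = frozenset(T)
    alpha = {j: Fraction(0) for j in T}
    frozen = frozenset()
    all_sets = list(nonempty_subsets(T))
    while frozen != T:
        active = T - frozen
        # Largest uniform raise keeping sum_{j in S} alpha_j <= c(S) for every S.
        step = min((Fraction(c(S)) - sum(alpha[j] for j in S)) / len(S & active)
                   for S in all_sets if S & active)
        for j in active:
            alpha[j] += step
        # The maximal tight set is the union of all tight sets.
        tight = [S for S in all_sets if sum(alpha[j] for j in S) == c(S)]
        frozen = frozenset().union(*tight)
    return alpha

# Hypothetical coverage game: c(S) is the cost of the resources that S needs.
RESOURCE_COST = {"X": 1, "Y": 4}
NEEDS = {"a": {"X"}, "b": {"X", "Y"}, "c": {"Y"}}

def coverage_cost(S):
    used = set().union(*(NEEDS[j] for j in S))
    return sum(RESOURCE_COST[r] for r in used)

shares_abc = submodular_cost_share("abc", coverage_cost)
shares_ab = submodular_cost_share("ab", coverage_cost)
print(shares_abc, shares_ab)
```

In this toy game agent b pays 4 when served together with a only, and 2 once c joins, which illustrates the cross-monotonicity established by Theorem 15.20.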


15.4.2 The Facility Location Game 


We now turn to our second example, the facility location game, and observe how the
standard primal-dual scheme for this problem fails to satisfy cross-monotonicity.
Recall the LP formulation (15.3) of the facility location problem, and the observation
that one can assume, without loss of generality, that in a solution to this program, we
have $\beta_{ji} = \max(0, \alpha_j - d_{ji})$. In designing a primal-dual algorithm for the facility
location game, we think of the quantity $\max(0, \alpha_j - d_{ji})$ as the contribution of agent j
toward facility i, and say that a facility i is tight with respect to the dual solution $\alpha$ if
the total contribution i receives in $\alpha$ equals its opening cost, i.e., if
$\sum_j \max(0, \alpha_j - d_{ji}) = f_i$.

Following the general paradigm of the primal-dual schema, let us consider the
following algorithm for computing cost shares in the facility location game: initialize
the cost shares $\alpha_j$ to zero and gradually increase them until one of these two events
occurs: a facility i goes tight, in which case this facility is opened, and the cost shares
of all agents j with positive contribution to i are frozen (i.e., no longer increased); or
for an agent j and a facility i that is already opened, $\alpha_j = d_{ji}$, in which case the cost
share of j is frozen. This process continues until all cost shares are frozen.

To illustrate this algorithm, consider the facility location game in Example 15.2 
(Figure 15.1). In this example, at time 2, facility 2 goes tight as each of b and c makes 
one unit of contribution toward this facility. Therefore, the cost shares of b and c are 
frozen at 2. The cost share of the agent a continues to increase to 4, at which point 
facility 1 also goes tight and the algorithm stops. In this example, the cost allocation 
(4, 2, 2) computed by the algorithm is budget balanced and satisfies the core property. 
In fact, it is known that in every instance, with a postprocessing step that closes some of
the open facilities, the cost of the primal solution can be brought down to at most 3 times
the sum of cost shares, and therefore the cost shares are 1/3-budget-balanced on every
instance of the facility location game. Unfortunately, the cross-monotonicity
property fails, as can be seen in the example in Figure 15.1. In this example, if only
agents a and b are present, they will both increase their cost share to 3, at which point 
both facilities go tight and the algorithm stops. Hence, agent a has a smaller cost share 
in the set {a, b} than in {a, b, c}. 

Intuitively, the reason for the failure of cross-monotonicity in the above example is 
that without c, b helps a pay for the cost of facility 1. However, when c is present, she 
helps b pay for the cost of facility 2. This, in turn, hurts a, as b stops helping a as soon 
as facility 2 is opened. This suggests the following way to fix the problem: we modify 
the algorithm so that even after an agent is frozen, she continues to grow her ghost 
cost share. This ghost cost share is not counted toward the final cost share of the agent, 
but it can help other agents pay for the opening cost of facilities. For example, in the 
instance in Figure 15.1 when all three agents are present, even though agents b and c 
stop increasing their cost share at time 2, their ghost share continues to grow, until at 
time 3, facility 1 is opened with contributions from agents a and b. At this point, the 
cost share of agent a is also frozen and the algorithm terminates. The final cost shares 
will be 3, 2, and 2. The pseudo-code of this algorithm is presented in Figure 15.4.
Variables $\alpha'_j$ in this pseudo-code represent the ghost cost shares.
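
The two variants discussed above (the standard dual ascent, and the version with ghost shares) can be compared with a small event-driven simulation. This is a sketch under assumptions, not the authors' code: the function name `fl_cost_share`, the `ghosts` flag, and the convention that `dist[j][i]` is the distance between agent j and facility i are introduced here. The instance is also an assumption: Figure 15.1 is not reproduced in this section, so the opening costs and distances below were chosen to be consistent with every number quoted in the text, with the two missing distances set to 4 by the triangle inequality.

```python
from fractions import Fraction

def fl_cost_share(T, f, dist, ghosts=True):
    """Cost shares for the agents in T; f[i] is an opening cost, dist[j][i] a distance."""
    T = list(T)
    alpha = {}          # frozen agents and their final cost shares
    opened = set()

    def offer(j, time):
        # Ghost shares and unfrozen shares both equal the current time.
        return time if (ghosts or j not in alpha) else alpha[j]

    def paid(i, time):
        return sum(max(Fraction(0), offer(j, time) - dist[j][i]) for j in T)

    def tight_time(i):
        # Earliest time at which the contributions toward facility i reach f[i].
        growing = sorted(Fraction(dist[j][i]) for j in T if ghosts or j not in alpha)
        fixed = sum(max(Fraction(0), alpha[j] - dist[j][i])
                    for j in T if not ghosts and j in alpha)
        need = Fraction(f[i]) - fixed
        for n in range(1, len(growing) + 1):   # n = number of growing contributors
            cand = (need + sum(growing[:n])) / n
            if cand >= growing[n - 1] and (n == len(growing) or cand <= growing[n]):
                return cand
        return None                            # no growing agent is left to pay for i

    while len(alpha) < len(T):
        events = [tight_time(i) for i in f if i not in opened]
        events += [Fraction(dist[j][i]) for i in opened for j in T if j not in alpha]
        t = min(e for e in events if e is not None)
        opened |= {i for i in f if paid(i, t) >= f[i]}
        for j in T:
            # Freeze j if it contributes to, or has just reached, an open facility.
            if j not in alpha and any(dist[j][i] <= t for i in opened):
                alpha[j] = t
    return alpha

# Assumed instance, consistent with the numbers quoted in the text.
F_COST = {1: 3, 2: 2}
DIST = {"a": {1: 1, 2: 4}, "b": {1: 2, 2: 1}, "c": {1: 4, 2: 1}}

plain = fl_cost_share("abc", F_COST, DIST, ghosts=False)
ghost = fl_cost_share("abc", F_COST, DIST, ghosts=True)
pair = fl_cost_share("ab", F_COST, DIST, ghosts=True)
print(plain, ghost, pair)
```

On this instance the run without ghosts charges agent a more in {a, b, c} than the 3 she pays in {a, b}, while the run with ghosts charges her 3 in both cases, matching the discussion above.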

With this modification, it is an easy exercise to show that the cost shares computed 
by the algorithm are cross-monotone. However, it is not clear that the budget balance 
property is preserved. In fact, it is natural to expect that having ghost cost shares that 
contribute toward opening facilities in the primal solution but do not count toward the 
final cost shares could hurt the budget balance (see, e.g., Exercise 15.3). For the facility
location problem, the following theorem shows that this is not the case. 


Theorem 15.21 The cost allocation computed by the algorithm FLCostShare
(Figure 15.4) is 1/3-budget-balanced.


Algorithm FLCostShare

Input: facility location game (A, c) defined by
       facility opening costs $f_i$ and distances $d_{ji}$;
       set $T \subseteq A$ of agents that receive the service
Output: cost shares $\alpha_j$ for every $j \in T$

For every j, initialize both $\alpha_j$ and $\alpha'_j$ to 0.
Let $F = \emptyset$.
While $T \setminus F \ne \emptyset$ do
    Increase all $\alpha_j$'s for $j \in T \setminus F$ and $\alpha'_j$'s for $j \in T$ at the same rate, until
    • for an unopened facility i, $\sum_{j \in T} \max(0, \alpha'_j - d_{ji}) = f_i$:
      in this case, open facility i, and
      add every agent j with a positive contribution toward i to F;
    • for an open facility i and an agent j, $\alpha_j = d_{ji}$:
      in this case, add j to F.

Figure 15.4. Algorithm for computing cost shares in the facility location game.


PROOF It is enough to show that for every instance of the facility location
problem, it is possible to construct a solution whose cost is at most three times
the sum of the cost shares computed by FLCostShare. To do this, we perform
the following postprocessing step on the solution computed by FLCostShare.
Let $t_i$ denote the time at which facility i is opened by FLCostShare, and order
all facilities that are opened by this algorithm in the order of increasing $t_i$'s. We
proceed in this order and for any facility i, check if there is any open facility i′ that
comes before i in this order and is within distance $2t_i$ of i. If such a facility exists,
we close facility i; otherwise, we keep it open. After processing all facilities in
this order, let F′ denote the set of facilities that remain open and connect each
agent in T to its closest facility in F′. We now show that $\sum_{j \in T} \alpha_j$ is enough to
pay for at least one third of the cost of this solution.

Let $S_i$ denote the set of agents within distance $t_i$ of facility i. First, observe
that for any two facilities i and i′ in F′, $S_i$ and $S_{i'}$ are disjoint. This is because if
there is an agent j in $S_i \cap S_{i'}$, the distance between i and i′ is at most
$t_i + t_{i'} \le 2 \max(t_i, t_{i'})$, and therefore one of i and i′ must have been closed in the above
postprocessing step. To complete the proof, it is enough to show two statements.
First, we show that for every facility $i \in F'$, the cost shares of agents in $S_i$ are
enough to pay for at least a third of their distances to i (and hence, to their closest
facility in F′) plus the opening cost of i. Second, we show that each agent j that
is not in $\bigcup_{i \in F'} S_i$ can pay for at least one third of its distance to its closest facility
in F′.

To prove the first statement, note that for every facility $i \in F'$, the ghost cost
share of every agent contributing to i at the time of its opening is precisely $t_i$; hence,
$\sum_{j \in S_i} (t_i - d_{ji}) = f_i$. Therefore, if we show that for every $j \in S_i$, $\alpha_j \ge t_i/3$, we
would get $\sum_{j \in S_i} \alpha_j \ge \frac{1}{3}\bigl(f_i + \sum_{j \in S_i} d_{ji}\bigr)$. Assume, for contradiction, that there is
an agent j with $\alpha_j < t_i/3$ and consider the facility $i_1$ with $t_{i_1} = \alpha_j$ (i.e., the facility
whose opening has caused the cost share of j to freeze). There must be a facility
$i_2 \in F'$ that is within distance $2t_{i_1}$ of $i_1$ (if $i_1 \in F'$, we can let $i_2 = i_1$). Therefore,
the distance between i and $i_2$ is at most $d_{ji} + d_{j i_1} + 2t_{i_1} \le t_i + 3\alpha_j < 2t_i$. This
contradicts the assumption that $i \in F'$, since i comes after $i_2$ in the ordering and
$i_2 \in F'$.

To show the second statement, consider an agent $j \in T \setminus \bigcup_{i \in F'} S_i$, and let i be
the facility with $t_i = \alpha_j$. There must be a facility i′ in F′ that is within distance
$2t_i$ of i (i′ can be the same as i). Therefore, the distance from j to its closest
facility in F′ is at most $d_{ji} + 2t_i \le 3\alpha_j$.
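
The postprocessing step used in this proof is a single greedy pass over the opened facilities. A minimal sketch follows; the function name, the `facility_dist` callback, and the three facilities placed on a line are hypothetical and serve only to illustrate the closing rule.

```python
def prune_open_facilities(open_time, facility_dist):
    """Return the facilities kept open by the postprocessing step, in opening order.

    open_time[i] is the opening time t_i of facility i;
    facility_dist(i, k) is the distance between facilities i and k.
    """
    kept = []
    for i in sorted(open_time, key=open_time.get):
        # Close i if an earlier facility that stayed open lies within 2 * t_i of it.
        if all(facility_dist(i, k) > 2 * open_time[i] for k in kept):
            kept.append(i)
    return kept

# Hypothetical facilities on a line: name -> position, and their opening times.
POSITION = {"x": 0, "y": 1, "z": 10}
OPEN_TIME = {"x": 1, "y": 2, "z": 3}

kept = prune_open_facilities(OPEN_TIME, lambda i, k: abs(POSITION[i] - POSITION[k]))
print(kept)
```

Here y is closed because x, which opened earlier, lies within distance 2 · 2 = 4 of it, while z stays open because x is farther than 2 · 3 = 6 away.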


15.5 Limitations of Cross-Monotonic Cost-Sharing Schemes 


As we saw in Section 15.3, a cost-sharing scheme that is cross-monotone also satisfies 
the core property. As a result, for any combinatorial cost-sharing game, an upper 
bound on the budget balance factor of cross-monotonic cost-sharing schemes can be 
obtained using Theorem 15.8 and the integrality gap examples of the corresponding 
LP. As we will see in this section, for many combinatorial optimization games, the 
cross-monotonicity property is strictly stronger than the core property, and better upper 
bounds on the budget balance factor of such games can be obtained using a technique 
based on the probabilistic method. 

The high-level idea of this technique is as follows: Fix any cross-monotonic
cost-sharing scheme. We explicitly construct an instance of the game and look at the
cost-sharing scheme on various subsets of this instance. We need to argue that there
is a subset S of agents such that the total cost share of the elements of S is small
compared to the cost of S. This is done using the probabilistic method: we pick a
subset S at random from a certain distribution and show that in expectation, the ratio
of the recovered cost to the cost of S is low. Therefore, there is a realization of S
for which this ratio is low. To bound the expected value of the sum of cost shares of
the elements of S, we use cross-monotonicity and bound the cost share of each agent
$i \in S$ by the cost share of i in a substructure $T_i$ of S. Bounding the expected cost
share of i in $T_i$ is done by showing that for every substructure T, every $i \in T$ has the
same probability of occurring in a structure S in which $T_i = T$. This implies that the
expected cost share of i in $T_i$ (where the expectation is over the choice of S) is at most
the cost of $T_i$ divided by the number of agents in $T_i$. Summing up these values for all i
gives us the desired bound.

In the following, we show how this technique can be applied to the facility location 
problem to show that the factor 1/3 obtained in the previous section is the best possible. 
We start by giving an example on which the algorithm FLCostShare in Figure 15.4 
recovers only a third of the cost. This example will be used as the randomly chosen 
structure in our proof. 


Lemma 15.22 Let $\mathcal{I}$ be an instance of the facility location problem consisting of
m + k agents $a_1, \ldots, a_m, a'_1, \ldots, a'_k$ and m facilities $f_1, \ldots, f_m$, each of opening
cost 3. For every i and j, the connection costs between $f_i$ and $a_i$ and between
$f_i$ and $a'_j$ are all 1, and other connection costs are obtained by the triangle
inequality. See Figure 15.5a. Then if $m = \omega(k)$ and k tends to infinity, the optimal
solution for $\mathcal{I}$ has cost 3m + o(m).


Figure 15.5. Facility location sample distribution.


PROOF The solution that opens just one facility, say $f_1$, has cost
3m + k + 1 = 3m + o(m). We show that this solution is optimal. Consider any feasible
solution that opens f facilities. The first opened facility can cover k + 1 agents
with connection cost 1. Each additional facility can cover 1 additional agent with
connection cost 1. Thus, the number of agents with connection cost 1 is k + f.
The remaining $m - f$ agents have connection cost 3. Therefore, the cost of the
solution is $3f + k + f + 3(m - f) = 3m + k + f$. As $f \ge 1$, this shows that
any feasible solution costs at least as much as the solution we constructed.
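
The counting argument in this proof can be confirmed by exhaustive search on small instances. The sketch below hard-codes the distances stated in the lemma (1 between $f_i$ and $a_i$, 1 between any facility and any $a'_j$, and 3 otherwise); the function name `optimal_cost` is an assumption made for this example.

```python
from itertools import combinations

def optimal_cost(m, k):
    """Brute-force optimum of the instance in Lemma 15.22 (small m only)."""
    best = None
    for size in range(1, m + 1):
        for opened in combinations(range(m), size):
            opened = set(opened)
            cost = 3 * len(opened)   # opening costs
            cost += k                # every a'_j is at distance 1 from any open facility
            # a_i pays 1 if its own facility f_i is open, and 3 otherwise.
            cost += sum(1 if i in opened else 3 for i in range(m))
            best = cost if best is None else min(best, cost)
    return best

for m, k in [(3, 1), (5, 2), (6, 3)]:
    print(m, k, optimal_cost(m, k), 3 * m + k + 1)
```

For every small pair (m, k) the search returns 3m + k + 1, the cost of the single-facility solution.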


Theorem 15.23. Any cross-monotonic cost-sharing scheme for facility location 
is at most 1/3-budget-balanced. 


PROOF Consider the following instance of the facility location problem. There
are k sets $A_1, \ldots, A_k$ of m agents each, where $m = \omega(k)$ and $k = \omega(1)$. For every
subset B of agents containing exactly one agent from each $A_i$ ($|B \cap A_i| = 1$
for all i), there is a facility $f_B$ with connection cost 1 to each agent in B. The
remaining connection costs are defined by extending the metric, that is, the cost
of connecting agent i to facility $f_B$ for $i \notin B$ is 3. The facility opening costs are
all 3.

We pick a random set S of agents in the above instance as follows: Pick a
random i from {1, ..., k}, and for every $j \ne i$, pick an agent $a_j$ uniformly at
random from $A_j$. Let $T = \{a_j : j \ne i\}$ and $S = A_i \cup T$. See Figure 15.5b for an
example. It is easy to see that the set S induces an instance of the facility location
problem almost identical to the instance $\mathcal{I}$ in Lemma 15.22 (the only difference
is that here we have more facilities, but it is easy to see that the only relevant
facilities are the ones that are present in $\mathcal{I}$). Therefore, the cost of the optimal
solution on S is 3m + o(m).

We show that for any cross-monotonic cost-sharing scheme $\xi$, the average
recovered cost over the choice of S is at most m + o(m) and thus conclude that
there is some S whose recovered cost is at most m + o(m). We start by bounding
the expected total cost share using linearity of expectation and cross-monotonicity:

$$
\begin{aligned}
\mathbb{E}_S\Bigl[\sum_{a \in S} \xi(a, S)\Bigr]
&= \mathbb{E}_S\Bigl[\sum_{a \in A_i} \xi(a, S)\Bigr] + \mathbb{E}_S\Bigl[\sum_{j \ne i} \xi(a_j, S)\Bigr] \\
&\le \mathbb{E}_S\Bigl[\sum_{a \in A_i} \xi(a, \{a\} \cup T)\Bigr] + \mathbb{E}_S\Bigl[\sum_{j \ne i} \xi(a_j, T)\Bigr].
\end{aligned}
$$

Notice that the set T has a facility location solution of cost 3 + k − 1 and thus by
the budget-balance condition the second term in the above expression is at most
k + 2. The first term in the above expression can be written as
$m\,\mathbb{E}_{S,a}[\xi(a, \{a\} \cup T)]$, where the expectation is over the random choice of S and the random choice
of a from $A_i$. This is equivalent to the following random experiment: From each
$A_j$, pick an agent $a_j$ uniformly at random. Then pick i from {1, ..., k} uniformly
at random and let $a = a_i$ and $T = \{a_j : j \ne i\}$. From this description it is clear
that the expected value of $\xi(a, \{a\} \cup T)$ is equal to
$\frac{1}{k}\,\mathbb{E}\bigl[\sum_{i=1}^{k} \xi(a_i, \{a_1, \ldots, a_k\})\bigr]$.
This, by the budget-balance property and the fact that $\{a_1, \ldots, a_k\}$ has a solution
of cost k + 3, cannot be more than $\frac{k+3}{k}$. Therefore,

$$
\mathbb{E}_S\Bigl[\sum_{a \in S} \xi(a, S)\Bigr] \le m \left(\frac{k+3}{k}\right) + (k + 2) = m + o(m), \tag{15.6}
$$

when $m = \omega(k)$ and $k = \omega(1)$. Therefore, the expected value of the ratio of
recovered cost to total cost tends to 1/3.
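
To see the limit numerically, one can evaluate the right-hand side of (15.6) against the cost of S. The sketch below is an illustration under two assumptions that are not in the text: it picks m = k² as one concrete choice satisfying m = ω(k), and it divides by 3m, the leading term of the cost 3m + o(m) of S, which by the counting in Lemma 15.22 is a lower bound on that cost.

```python
from fractions import Fraction

def recovered_fraction_bound(k):
    """Upper bound on (expected recovered cost) / (cost of S) for m = k * k."""
    m = k * k                                        # one choice with m = omega(k)
    recovered = Fraction(m * (k + 3), k) + (k + 2)   # right-hand side of (15.6)
    return recovered / (3 * m)                       # the cost of S is at least 3m

for k in (10, 100, 1000):
    print(k, float(recovered_fraction_bound(k)))
```

The bound equals 1/3 + (4k + 2)/(3k²) for this choice of m, so it decreases toward 1/3 as k grows.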


15.6 The Shapley Value and the Nash Bargaining Solution 


One of the problems with the notion of core in cost-sharing games is that it rarely 
assigns a unique cost allocation to a game: as illustrated in Example 15.4, the core 
of a game is often either empty (making it useless in deciding how the cost of a 
service should be shared among the agents), or contains more than one point (making it 
necessary to have a second criterion for choosing a cost allocation). In this section, we 
study a solution concept called the Shapley value that assigns a single cost allocation 
to any given cost-sharing game. We also discuss a solution concept known as the Nash 
bargaining solution for a somewhat different but related framework for surplus sharing. 
In both cases, the solution concept can be uniquely characterized in terms of a few 
natural axioms it satisfies. These theorems are classical examples of the axiomatic 
approach in economic theory. 

Both the Shapley value and the Nash bargaining solution are widely applicable 
concepts. For example, an application of the Shapley value in combination with the 
Moulin mechanism to multicasting is discussed elsewhere in this book (see Section 
14.2.2). Also, the Nash solution is related to Kelly’s notion of proportional fairness 
discussed in Section 5.12, and the Eisenberg-Gale convex program of Section 6.2.


15.6.1 The Shapley Value


Consider a cost-sharing game defined by the set A of n agents and the cost function c.
A simple way of allocating the cost c(A) among all agents is to order the agents in some
order, say $a_1, a_2, \ldots, a_n$, then proceed in this order and charge each agent the marginal
cost of adding her to the serviced set. In other words, the first agent $a_1$ will be charged
her stand-alone cost $c(\{a_1\})$, the second agent $a_2$ will be charged $c(\{a_1, a_2\}) - c(\{a_1\})$,
and so on. This method is called incremental cost sharing.

A problem with the method described above is that it is not anonymous, i.e., the
ordering of the agents makes a difference in the amount they will be charged. The
Shapley value fixes this problem by taking a random ordering of the agents picked
uniformly from the set of all n! possible orderings, and charging each agent her expected
marginal cost in this ordering. Since for any agent $i \in A$ and any set $S \subseteq A \setminus \{i\}$ with
|S| = s, the probability that the set of agents that come before i in a random ordering
is precisely S is $s!\,(n - 1 - s)!/n!$, the Shapley value can be defined by the following
formula:

$$
\text{For each agent } i, \quad \phi_i(c) = \sum_{s=0}^{n-1} \frac{s!\,(n-1-s)!}{n!} \sum_{S \subseteq A \setminus \{i\},\ |S| = s} \bigl(c(S \cup \{i\}) - c(S)\bigr),
$$

where $\phi_i(c)$ indicates the cost share of $i \in A$ in the cost-sharing game (A, c). As the
following example shows, the cost sharing given by the Shapley value need not be in
the core of the game, even if the core is nonempty.


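Example 15.24 Consider the facility location game defined in Example 15.2.
The Shapley values in this game are as follows:

$$
\begin{aligned}
\phi_a &= \tfrac{1}{3} \times 4 + \tfrac{1}{6} \times 3 + \tfrac{1}{6} \times 4 + \tfrac{1}{3} \times 4 = \tfrac{23}{6}, \\
\phi_b &= \tfrac{1}{3} \times 3 + \tfrac{1}{6} \times 2 + \tfrac{1}{6} \times 1 + \tfrac{1}{3} \times 1 = \tfrac{11}{6}, \\
\phi_c &= \tfrac{1}{3} \times 3 + \tfrac{1}{6} \times 3 + \tfrac{1}{6} \times 1 + \tfrac{1}{3} \times 2 = \tfrac{7}{3}.
\end{aligned}
$$

This cost allocation is not in the core of the game, since
$\phi_b + \phi_c = \tfrac{25}{6} > 4 = c(\{b, c\})$. This is despite the fact that, as we saw in Example 15.4, the core of this
game is nonempty.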
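
The values in this example can be reproduced by averaging marginal costs over all orderings, which is the definition given above. In the sketch below the cost table `COST` is an assumption: Example 15.2 is not reproduced in this section, so the table was reconstructed from the marginal costs that appear in the three sums of the example. The function name `shapley_value` is likewise introduced here.

```python
from fractions import Fraction
from itertools import permutations

def shapley_value(agents, c):
    """Exact Shapley value: average marginal cost over all n! orderings."""
    agents = list(agents)
    orders = list(permutations(agents))
    total = {i: Fraction(0) for i in agents}
    for order in orders:
        served = frozenset()
        for i in order:
            total[i] += Fraction(c(served | {i}) - c(served))
            served = served | {i}
    return {i: value / len(orders) for i, value in total.items()}

# Cost function reconstructed from the marginal costs used in Example 15.24.
COST = {
    frozenset(): 0,
    frozenset("a"): 4, frozenset("b"): 3, frozenset("c"): 3,
    frozenset("ab"): 6, frozenset("ac"): 7, frozenset("bc"): 4,
    frozenset("abc"): 8,
}

phi = shapley_value("abc", COST.__getitem__)
print(phi)
print(phi["b"] + phi["c"] > COST[frozenset("bc")])  # the core constraint for {b, c} is violated
```

Enumerating all orderings takes time proportional to n!, so this direct approach is practical only for a handful of agents.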


However, for submodular games, it is known that any incremental cost-sharing, and
therefore the Shapley value (which is an average, and hence a convex combination, of
incremental cost-sharing methods), is in the core of the game. In fact, it can be shown
that in this class of games, the Shapley value is cross-monotone (see Exercise 15.2),
making it useful in the design of group-strategyproof mechanisms using Moulin’s
mechanism of Section 15.3.⁴ This is used in Section 14.2.2 of this book, in an
application to the multicast problem. Many other applications of the Shapley value,
as well as various generalizations (to settings such as NTU games or games with
nonbinary demands), are extensively studied in the economic literature.


4 It is worth noting that since the Shapley value is defined in terms of a formula involving exponentially many
values of the function c(·), evaluating it is computationally hard in general. However, when the cost function c
is submodular, random sampling can be used to approximate the Shapley values to within an arbitrary degree
of accuracy.
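
The sampling idea mentioned in footnote 4 can be sketched as a plain Monte Carlo estimate: draw random orderings and average the marginal costs. This is only an illustration, not the estimator with accuracy guarantees that the footnote alludes to; the function name, the sample count, the fixed seed, and the toy cost function `capped_cost` are all assumptions made for this example.

```python
import random

def sampled_shapley(agents, c, samples=20000, seed=0):
    """Estimate Shapley values by averaging marginal costs over random orderings."""
    rng = random.Random(seed)
    agents = list(agents)
    totals = {i: 0.0 for i in agents}
    for _ in range(samples):
        order = agents[:]
        rng.shuffle(order)
        served = frozenset()
        for i in order:
            totals[i] += c(served | {i}) - c(served)
            served = served | {i}
    return {i: total / samples for i, total in totals.items()}

def capped_cost(S):
    # Hypothetical submodular cost: 0, 2, 3, 3 for |S| = 0, 1, 2, 3.
    return min(2 * len(S), 3)

estimate = sampled_shapley("abc", capped_cost)
print(estimate)
```

By symmetry the exact Shapley value of each agent in this toy game is 1, and the estimates concentrate around that value. Because the marginal costs along any single ordering sum to c(A), the estimates always add up to c(A).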


15.6.2 An Axiomatic Characterization of the Shapley Value 


In his original paper, Shapley introduced what is now known as the Shapley value as 
the unique value satisfying the three properties defined below. 


Definition 15.25 Fix a set A of n agents. A value is a function that assigns to
each cost function c a vector $\phi(c) \in \mathbb{R}^n$ of nonnegative numbers. Three properties
of values are defined as follows.

• Anonymity: Changing the names of the agents does not change their cost shares.
  Formally, $\phi$ satisfies anonymity if for every permutation $\pi$ of A and every cost
  function c, $\phi_{\pi(i)}(\pi(c)) = \phi_i(c)$ for every $i \in A$.

• Dummy: An agent who does not add to the cost should not be charged anything.
  More precisely, if for every set $S \subseteq A \setminus \{i\}$, $c(S) = c(S \cup \{i\})$, then $\phi_i(c) = 0$.

• Additivity: For every two cost functions $c_1$ and $c_2$, $\phi(c_1 + c_2) = \phi(c_1) + \phi(c_2)$,
  where $c_1 + c_2$ is the cost function defined by $(c_1 + c_2)(S) = c_1(S) + c_2(S)$.


Theorem 15.26 The Shapley value is the unique value satisfying anonymity, 
dummy, and additivity. 


The above theorem, whose proof is omitted here, is an example of the axiomatic
method in economic theory, whose goal is to find (or prove the nonexistence of)
solution concepts that satisfy certain sets of desirable axioms, or characterize known 
solution concepts in terms of axioms they satisfy. Two other prominent examples of 
axiomatic results are Nash’s theorem on bargaining (presented in the next section; this 
result is considered a starting point for the axiomatic approach in economic theory), 
and Arrow’s impossibility result in the social choice literature. One example where 
this framework is applied in computer science is the axiomatic characterization of the 
PageRank algorithm for ranking Web search results. 


15.6.3 The Nash Bargaining Solution 


The bargaining problem studies a situation where two or more agents need to select
one of the many possible outcomes of a joint collaboration. Examples include wage
negotiation between an employer and a potential employee, or trade negotiation
between two countries. Each party in the negotiation has the option of leaving the table,
in which case the bargaining will result in a disagreement outcome. More formally,
a bargaining game for two players (the case of more players is similar) is given by
a set $X \subseteq \mathbb{R}^2$, along with a disagreement point $d \in X$. Each point in X corresponds
to one outcome of the bargaining, and specifies the utility of each player for this
outcome. The point d specifies the utility of each player for the disagreement outcome. As
adding or subtracting a value to the utility of an individual does not change her relative
preferences, we assume, without loss of generality, that d = (0, 0). Furthermore, we
assume that the set X is convex and compact. Note that convexity of X is without
loss of generality, if an outcome is allowed to be a probability distribution over pure
outcomes. Finally, we assume X contains at least one point whose coordinates
are both positive (i.e., both parties have some incentive to negotiate).

The above model for bargaining was first defined and studied by Nash. Note that 
an NTU cooperative game can be considered an extension of the bargaining model, 
where in addition to the outcome of individual deviations (the disagreement point), 
the outcomes of group deviations are also given. Nash’s bargaining theorem gives a
characterization of a solution for the bargaining game in terms of axioms it satisfies. 
Formally, 


Definition 15.27 A solution for the bargaining game (also known as a social
choice function) is a function that assigns to each set X satisfying the above
properties a single point $\phi(X) \in X$. We define four properties of a solution as
follows:

• Pareto Optimality: $\phi(X)$ is a Pareto optimal point in X, i.e., there is no point $p \in X$
  with $p > \phi(X)$, coordinate-wise.

• Symmetry: If the set X is symmetric, then $\phi(X) = (u, u)$ for some $u \in \mathbb{R}$.

• Scale Independence: The solution is independent of the scale used to measure
  individual utilities; i.e., if X′ is obtained from X by multiplying all utilities of the
  i’th player by $\lambda_i$, then $\phi(X')$ can be obtained from $\phi(X)$ by multiplying the i’th
  coordinate by $\lambda_i$.

• Independence of Irrelevant Alternatives: If $Y \subseteq X$ and $\phi(X) \in Y$, then
  $\phi(Y) = \phi(X)$.


We now state Nash’s bargaining theorem. The proof of this theorem is simple, and is 
omitted here. 


Theorem 15.28 There is a unique solution for bargaining games satisfying
Pareto optimality, symmetry, scale independence, and independence of irrelevant
alternatives. This solution assigns to each set X a point $(u_1, u_2)$ maximizing $u_1 u_2$.


Nash’s theorem gives one example of what is called a collective utility function. A 
collective utility function is a function that aggregates the utilities of individuals into 
a single number indicating the utility of the society. Classical examples of collective 
utility functions are the utilitarian function (which simply adds up the individual 
utilities), the egalitarian function (which takes the minimum of individual utilities), 
and the Nash function (which takes the product of the utilities). 
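
The three collective utility functions generally select different outcomes from the same bargaining set. The sketch below compares them on a hypothetical set X = {(u₁, u₂) ≥ 0 : u₁ + 2u₂ ≤ 2} with disagreement point (0, 0); the set, the grid of sampled Pareto-optimal points, and the helper name `best_outcome` are assumptions made for illustration, and searching a finite grid is a simplification of maximizing over all of X.

```python
from fractions import Fraction
from math import prod

def best_outcome(outcomes, collective_utility):
    """Return the outcome that maximizes the given collective utility function."""
    return max(outcomes, key=collective_utility)

# Pareto-optimal points of the hypothetical set lie on u1 + 2*u2 = 2;
# sample them on a grid of step 1/30 in u1.
frontier = [(Fraction(i, 30), (2 - Fraction(i, 30)) / 2) for i in range(61)]

utilitarian = best_outcome(frontier, sum)    # maximizes u1 + u2
egalitarian = best_outcome(frontier, min)    # maximizes min(u1, u2)
nash = best_outcome(frontier, prod)          # maximizes u1 * u2

print(utilitarian, egalitarian, nash)
```

On this set the utilitarian function gives everything to the first player, the egalitarian function equalizes the two utilities, and the Nash function picks the point (1, 1/2). Rescaling the second player's utility by 2 turns the set into a symmetric one whose Nash solution is (1, 1), in line with the scale independence and symmetry axioms.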


15.7 Conclusion 


In this chapter, we reviewed some of the basic notions (such as the core,
cross-monotonicity, group-strategyproof mechanisms, Shapley value, and the Nash
bargaining solution) and classical results on cost and surplus sharing. We observed that
the algorithmic questions regarding computing cost shares are closely tied to the LP
formulation of the corresponding optimization problem, and explained how standard
LP-based techniques developed in the field of approximation algorithms can be used
to tackle such questions.

There is also a potential for contributions in the other direction. For many combinatorial
optimization problems, thinking in terms of the cost-sharing problem (i.e.,
the dual problem) instead of the primal can shed new light on the problem. In the
facility location example discussed in Section 15.4.2, the proof of Theorem 15.21 
gives an approximation algorithm different from the standard primal-dual algorithm 
for the problem. As it turns out, in this case the algorithm was known before, but 
Theorem 15.21 gives a new primal-dual interpretation of this algorithm. For the Steiner 
forest problem, the search for a cross-monotonic cost-sharing scheme has resulted in a 
new 2-approximation algorithm, and a stronger LP relaxation for the problem. In fact, 
for most combinatorial optimization problems, LP (15.2) is at least as strong an LP 
formulation as the standard LP relaxation; i.e., it gives at least as good a lower bound 
on the value of the optimal solution. These LPs are equivalent for some problems, as 
we saw in Section 15.2.2 for the facility location problem. However, for many other 
problems, such as the well-studied Steiner tree problem, this appears not to be the 
case. Therefore, one possible approach to obtain stronger LP relaxations (which could 
lead to better approximation algorithms) for such problems is to start from (15.2) and 
try to relax this program into one that can be solved in polynomial time. In the case 
of the Steiner tree problem, the integrality gap of LP (15.2) seems to be related to 
the long open question on the integrality gap of the bidirected LP relaxation of this 
problem. 

Another way the economic approach to cost sharing can contribute to the theory
of algorithms is by providing new perspectives and new problems. For example, the
axiomatic approach explained in Section 15.6 seems to be a suitable tool for studying
properties of heuristic algorithms. One notable example is the axiomatic characterization
of the popular web ranking algorithm PageRank. Also, the field of combinatorial
optimization almost exclusively deals with problems whose objective is to minimize
the total cost, or maximize the total benefit, which, according to the terminology
introduced in Section 15.6.3, corresponds to the utilitarian collective utility function.
However, the field of social choice suggests other objective functions, which can lead
to new challenging algorithmic questions. One notable example is the Santa Claus problem,
which seeks to optimize the egalitarian objective in a simple scheduling model.
Also, many of the algorithmic results presented in Chapter 5 for Fisher markets (where
the Eisenberg-Gale convex program shows that the market equilibrium corresponds
to the point maximizing the Nash collective utility function) can be viewed in this
light.


15.8 Notes 


Sections 15.1 and 15.2. The notion of a cooperative game was first proposed by von 
Neumann and Morgenstern (1944). The notion of the core was first introduced by 
Gillies (1959). Theorem 15.6 was independently discovered by Bondareva (1963) and 
Shapley (1967). Theorem 15.11 is due to Scarf (1967). Deng et al. (1997) observed 
the connection between the core of many combinatorial optimization games and the 
integrality gap of the corresponding LP. Goemans and Skutella (2000) showed this 
connection for the facility location game, and proved that deciding whether the core of 
a facility location game is nonempty is NP-complete. The best lower and upper bounds 
on the integrality gap of LP (15.3) are 1/1.52, due to Mahdian et al. (2006), and 1/1.463, 
due to Guha and Khuller (1999). See Immorlica et al. (2006) for an example of a problem 
modeled using NTU games. 


Section 15.3. For a discussion of the NPT, VP, and CS properties of cost sharing 
mechanisms see Moulin (1999) and Moulin and Shenker (2001). In our definition of 
group-strategyproof mechanisms, we did not allow side payments between members of 
a coalition. For a discussion of mechanism design in a setting where collusion with side 
payments is allowed, see Goldberg and Hartline (2005). The cross-monotonicity property 
for cost sharing is similar to the population monotonicity property introduced by 
Thomson (1983, 1995) in the context of bargaining. For cooperative games, this notion 
was first introduced by Sprumont (1990). The mechanism $M_\xi$ and Theorem 15.16 are 
due to Moulin (1999), where he also proves a converse to this theorem for submodu- 
lar games. Examples on the connection between group-strategyproof mechanisms and 
cost-sharing schemes, and a partial characterization of such mechanisms are due to 
Immorlica et al. (2005). 


Sections 15.4 and 15.5. For a general introduction to the primal-dual schema from the 
perspective of approximation algorithms, see the excellent book by Vazirani (2001). 
The cost-sharing scheme presented in Section 15.4 for submodular games is due to 
Dutta and Ray (1989). This scheme was formulated as a primal-dual algorithm and 
generalized to an algorithm that can increase the dual variables at different rates by Jain 
and Vazirani (2002). Both Dutta and Ray (1989) and Jain and Vazirani (2002) also prove 
several fairness properties of their cost-sharing schemes. The technique of using ghost 
duals and its application to the facility location problem (algorithm in Figure 15.4) and 
single-source rent-or-buy problem are due to Pál and Tardos (2003). The proof of their 
result on the facility location problem (Theorem 15.21) is based on an algorithm that 
is originally due to Mettu and Plaxton (2000). The first (non-cross-monotonic) primal- 
dual algorithm for the facility location problem is due to Jain and Vazirani (2001). 
The probabilistic technique presented in Section 15.5 and its application to several 
problems including facility location, vertex cover, and set cover are due to Immorlica 
et al. (2005). Könemann et al. (2007) gave a 1/2-budget-balanced mechanism, together 
with a matching upper bound for the Steiner forest problem. 


Section 15.6. The Shapley value and its axiomatic characterization (Theorem 15.26) 
are due to Shapley (1953). In the same paper, Shapley shows that for convex games 
(which correspond to submodular games in the context of cost sharing) the Shapley 
value is in the core. The application of Shapley values to the multicast problem is 
due to Feigenbaum et al. (2000) and is explained in detail in Chapter 14. For other 
applications of the Shapley value, see the book edited by Roth (1988) or the survey 
by Winter (2002). The generalization of the Shapley value to games with nonbinary 
demand is due to Aumann and Shapley (1974). See the survey by McLean (1994) for 
various generalizations to NTU games. The result on the computation of Shapley values 
for submodular games is due to Mossel and Saberi (2006). The axiomatic result of 
Arrow is given in Arrow (1959). Axiomatic characterizations of PageRank (Page et al., 
1999) are given by Palacios-Huerta and Volij (2004) and by Altman and Tennenholtz 
(2005). We refer the reader to the excellent survey by Moulin (2002) for further 
information on the axiomatic approach to cost sharing. Theorem 15.28 is proved in 
a seminal paper by Nash (1950). See Moulin (1988) for further discussion of this 
theorem and its generalization to more than two players. See Moulin (1988, 2003) 
for a discussion of various collective utility functions and social choice rules. For 
more information on the Santa Claus problem see Bansal and Sviridenko (2006) and 
Asadpour and Saberi (2006). 
Exercises 


15.1 Consider a setting with n agents and m goods where each agent is endowed with a 
bundle of goods and a linear utility function that specifies the utility that this agent 
derives from consuming a bundle (this is the same as the linear Arrow-Debreu 
markets defined in Section 5.10 of this book). The value of a coalition S of agents 
in this model can be defined as the maximum total utility that agents in S can 
derive by optimally redistributing their endowments. Model this setting as a TU 
game. Does this game always have a nonempty core? 


15.2 Prove that for submodular cost-sharing games, the Shapley value is 
cross-monotone. 


15.3 In the vertex cover game, agents correspond to edges in a graph, and the cost of 
a set S of agents is the minimum size of a set of vertices that contains at least 
one of the endpoints of each edge in S. A simple primal-dual approach gives a 
2-approximation algorithm for this problem. Modify this algorithm using the idea of 
ghost cost shares to obtain a cross-monotonic cost-sharing scheme. Find examples 
where this scheme fails to extract a constant fraction of the cost of the solution. Use 
this example, together with the technique explained in Section 15.5, to prove that 
no cross-monotonic cost-sharing scheme for this game is Ω(1)-budget-balanced. 


CHAPTER 16 


Online Mechanisms 


David C. Parkes 


Abstract 


Online mechanisms extend the methods of mechanism design to dynamic environments with multiple 
agents and private information. Decisions must be made as information about types is revealed online 
and without knowledge of the future, in the sense of online algorithms. We first consider single- 
valued preference domains and characterize the space of decision policies that can be truthfully 
implemented in a dominant strategy equilibrium. Working in a model-free environment, we present 
truthful auctions for domains with expiring items and limited-supply items. Turning to a more general 
preference domain, and assuming the existence of a probabilistic model for agent types, we define a 
dynamic Vickrey-Clarke-Groves mechanism that is efficient and Bayes-Nash incentive compatible. 
We close with some thoughts about future research directions in this area. 


16.1 Introduction 


The decision problem in many multiagent problem domains is inherently dynamic 
rather than static. Consider, for instance, the following environments: 


• Selling seats on an airplane to buyers arriving over time. 

• Allocating computational resources (bandwidth, CPU, etc.) to jobs arriving over time. 

• Selling adverts on a search engine to a possibly changing group of buyers and with 
uncertainty about the future supply of search terms. 

• Allocating tasks to a dynamically changing team of agents. 


In each of these settings at least one of the following is true: either agents are 
dynamically arriving or departing, or there is uncertainty about the set of feasible 
decisions in the future. These dynamics present a new challenge when seeking to 
sustain good systemwide decisions in multiagent systems with self-interested agents. 

This chapter introduces the problem of online mechanism design (online MD), 
which generalizes the theory of computational mechanism design to apply to dynamic 
problems. Decisions must be made dynamically and without knowledge of future agent 
types or future decision possibilities, in the sense of online algorithms. 
16.1.1 Example: Dynamic Auction with Expiring Items 


Consider a dynamic auction model with discrete time periods $T = \{1, 2, \ldots\}$ and 
a single indivisible item to allocate in each time period. The type of an agent 
$i \in \{1, \ldots, N\}$ is denoted $\theta_i = (a_i, d_i, w_i) \in T \times T \times \mathbb{R}_{>0}$. 
Agent $i$ has arrival time $a_i$, departure time $d_i$, value $w_i$ for an allocation of a 
single unit of the item in some period $t \in [a_i, d_i]$, and wants at most one unit. This 
type information is all private to an agent. 
We refer to this as the canonical expiring items environment. 

The arrival time has a special meaning: it is the first period in which information 
about the type of this agent can be made available to the auction. (We say “can be made 
available” because a self-interested agent may choose to delay its report.) Assume 
quasi-linear utility, with utility $w_i - p$ when the item is allocated in some $t \in [a_i, d_i]$ 
and payment $p$ is collected from the agent. Consider the following naive generalization 
of the Vickrey auction to this dynamic environment. 


Auction 1. A bid from an agent is a claim about its type, 
$\hat{\theta}_i = (\hat{a}_i, \hat{d}_i, \hat{w}_i)$, necessarily made in period $t = \hat{a}_i$. 
Then: in each period $t$, allocate the item to the highest 
unassigned bid, breaking ties at random. Collect payment equal to the second-highest 
unallocated bid in this round. 


Example 16.1 Jane sells ice cream and can make one cone each hour. 
The ice cream melts if it is not sold. There are three buyers, with types 
(1, 2, 100), (1, 2, 80), and (2, 2, 60), indicating (arrival, departure, value). Buyers 
1 and 2 are willing to buy an ice cream in either period 1 or 2 while buyer 3 will 
only buy an ice cream in period 2. In this example, if every buyer is truthful then 
buyer 1 wins in period 1 for 80, stops bidding, and buyer 2 wins in period 2 for 
60. But buyer 1 can do better. For example, buyer 1 can report type (1, 2, 61), so 
that buyer 2 wins in period 1 for 61, stops bidding, and then buyer 1 wins for 60 
in period 2. Buyer 1 can also report type (2, 2, 80) and delay its bid until period 
2, so that buyer 2 wins for 0 in period 1, stops bidding, and then buyer 1 wins for 
60 in period 2. 


In a static situation the Vickrey auction is (dominant-strategy) truthful because an 
agent does not affect the price it faces. But, in a sequential setting an agent can choose 
the auction in which it participates and thus choose the other agents against which 
it competes and, in turn, the price faced. In fact, if every agent were impatient (with 
$d_i = a_i$), then prices in future periods would be irrelevant and the dominant strategy would be to bid 
truthfully immediately upon arrival. Note also that buyer 1’s manipulation relied on a 
suitable bid from buyer 3 in period 2 and will not always be useful. Nevertheless, this 
serves to demonstrate the failure of dominant strategy truthfulness. 
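
The manipulation in Example 16.1 can be replayed with a small simulation. The following is a minimal sketch, not a reference implementation: the names `Bid`, `run_auction1`, and `utility` are hypothetical, and ties are broken in favor of the bid listed first rather than at random, which does not matter here because no ties occur.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    name: str
    arrival: int
    departure: int
    value: float


def run_auction1(bids, horizon):
    """Run the naive per-period second-price auction (Auction 1).

    Returns {bidder name: (period won, price paid)}. Ties go to the bid
    listed first, a deterministic stand-in for random tie-breaking.
    """
    outcome = {}
    for t in range(1, horizon + 1):
        # Bids that have arrived, have not departed, and have not yet won.
        active = [b for b in bids
                  if b.arrival <= t <= b.departure and b.name not in outcome]
        if not active:
            continue
        ranked = sorted(active, key=lambda b: b.value, reverse=True)
        winner = ranked[0]
        # Second-highest active bid in this round, or 0 if there is none.
        price = ranked[1].value if len(ranked) > 1 else 0.0
        outcome[winner.name] = (t, price)
    return outcome


def utility(outcome, name, true_value):
    """Quasi-linear utility; assumes any win falls inside the true window."""
    if name not in outcome:
        return 0.0
    _, price = outcome[name]
    return true_value - price


others = [Bid("buyer2", 1, 2, 80), Bid("buyer3", 2, 2, 60)]
truthful = run_auction1([Bid("buyer1", 1, 2, 100)] + others, horizon=2)
shaded = run_auction1([Bid("buyer1", 1, 2, 61)] + others, horizon=2)
delayed = run_auction1([Bid("buyer1", 2, 2, 80)] + others, horizon=2)

u_truthful = utility(truthful, "buyer1", 100)
u_shaded = utility(shaded, "buyer1", 100)
u_delayed = utility(delayed, "buyer1", 100)
```

Buyer 1's true value is 100 in all three runs; only the report changes. Both misreports leave buyer 1 paying 60 instead of 80, matching the example.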


16.1.2 The Challenge of Online MD 


The dynamics of agent arrivals and departures, coupled perhaps with uncertainty 
about the set of feasible decisions in the future and in general about the state of the 
environment, makes the problem of online MD fundamentally different from 
that of standard (offline) MD. Important new considerations in online MD are as 
follows. 


(i) Decisions must be made without information about agent types not yet arrived, coupled 
perhaps with uncertainty about which decisions will be feasible in future periods. 
(ii) Agents can misrepresent their arrival and departure time in addition to their valuation 
for sequences of decisions. Because of this, agent strategies also have a temporal aspect. 
(iii) Only limited misreports of type may be available; for instance, it may be impossible 
for an agent to report an earlier arrival than its true arrival. 


More generally, online MD can also model settings in which an agent’s type is 
revealed to itself over time and with its ability to learn dependent on decisions made 
by the online mechanism; e.g., a bidder needs to receive a resource to understand its 
value for the resource. 

There are two main frameworks in which to study the performance of online mech- 
anisms. The first is model-free and adopts a worst-case analysis and is useful when 
a designer does not have good probabilistic information about future agent types or 
about feasible decisions in future periods. The second is model-based and adopts an 
average-case analysis. As a motivating example, consider a search engine selling search 
terms to advertisers. This is a data-rich environment and it is reasonable to believe that 
the seller can build an accurate model to predict the distribution on types of buyers, 
including the process governing arrivals and departures. 


16.1.3 Outline 


In Section 16.2 we present a general model for online MD and introduce the con- 
cept of limited misreports. Given this, we define direct-revelation, online mecha- 
nisms together with appropriate notions of incentive compatibility. Section 16.3 pro- 
vides a characterization of truthful online mechanisms in the restricted domain of 
single-valued preferences and gives detailed examples of truthful, dynamic auctions. 
These auctions are analyzed within the framework of worst-case, competitive analysis. 
Section 16.4 considers general preference domains and defines a dynamic 
Vickrey-Clarke-Groves mechanism that is efficient and applicable when a model is available 
and common knowledge to agents. Section 16.5 closes with open problems and future 
directions. 


16.2 Dynamic Environments and Online MD 


The basic setting assumes risk neutral agents with quasi-linear utility functions, such 
that an agent acts to maximize the expected difference between its value from a sequence 
of decisions and its total payment. Consider discrete time periods $T = \{1, 2, \ldots\}$, 
indexed by $t$ and possibly infinite. A mechanism makes (and enforces) a sequence 
of decisions $k = (k^1, k^2, \ldots) \in O$, with decision $k^t$ made in period $t$. Let 
$k^{[t_1, t_2]} = (k^{t_1}, \ldots, k^{t_2})$. The decisions made by a mechanism can depend on 
messages, such as bids, received from agents as well as uncertain events that occur in the 
environment. For example, in sponsored search the realized supply of search terms determines the 
feasible allocation of user attention to advertisers. 

An agent’s type, $\theta_i = (a_i, d_i, w_i) \in \Theta_i$, where $\Theta_i$ is the set of possible types for 
agent $i$, defines a valuation function $v_i(\theta_i, k) \in \mathbb{R}$ on a sequence of decisions $k$ and 
is private to an agent. Time periods $a_i, d_i \in T$ denote an agent’s arrival and departure 
period and $v_i(\theta_i, k) = v_i(\theta_i, k^{[a_i, d_i]})$; i.e., an agent’s value is invariant to decisions 
outside of its arrival-departure window. In addition to restricting the scope of decisions 
that influence an agent’s value, the arrival period models the first period at which the 
agent is able to report its type to the mechanism. 

The valuation component $w_i \in W_i$ of an agent’s type, where $W_i$ denotes the set 
of possible valuations, parameterizes the agent’s valuation function and can be more 
expressive than a single real number. For example, in an online combinatorial auction 
this needs to convey enough information to define substitutes (“I want item A or item B 
but not both”) or complements (“I only want item A if I also get item B”) preferences. 
Nor does the valuation need to be constant across all periods; for instance, an agent 
could discount its value in future periods $t > a_i$ by discount factor $\gamma^{t - a_i}$ for 
$\gamma \in (0, 1)$. 


16.2.1 Direct-Revelation Mechanisms 


The family of direct-revelation, online mechanisms restricts the message that an agent 
can send to the mechanism to a single, direct claim about its type. For the most part we 
consider “closed” mechanisms so that an agent receives no feedback before reporting 
its type, and cannot condition its strategy on the report of another agent. 

The mechanism state, $h^t \in H^t$, where $H^t$ is the set of possible states in period $t$, 
captures all information relevant to the decision by the mechanism in that period. Let 
$\Omega$ denote the set of possible stochastic events that can occur in the environment, 
such as the realization of uncertain supply, with $\omega \in \Omega$ a particular realization. 
This does not include the types of agents 
or any randomization within the mechanism itself. Write $\Omega = \prod_{t \in T} \Omega^t$ and let 
$\omega^t \in \Omega^t$ denote the information about $\omega$ that is revealed in period $t$. Similarly, 
let $\theta^t$ denote the set of agent types reported in period $t$. Given this, it is convenient to define 
$h^t = (\theta^1, \ldots, \theta^t; \omega^1, \ldots, \omega^t; k^1, \ldots, k^{t-1})$. In practice, the state will be represented 
by a small, sufficient statistic of this information. The state space $H = \bigcup_t H^t$ may be 
finite, countably infinite, or continuous. This depends, in part, on whether agent types 
are discrete or continuous. Let $K(h^t)$ denote the set of all feasible decisions in the 
current time period, assumed finite for all $h^t$. Let $I(h^t)$ denote the set of active agents 
in state $h^t$, i.e., those agents for which $t \in [a_i, d_i]$. 


Definition 16.2 (direct-revelation online mechanism) A direct-revelation online 
mechanism, $M = (\pi, x)$, restricts each agent to making a single claim about 
its type, and defines decision policy $\pi = \{\pi^t\}^{t \in T}$ and payment policy $x = \{x^t\}^{t \in T}$, 
where decision $\pi^t(h^t) \in K(h^t)$ is made in state $h^t$ and payment $x_i^t(h^t) \in \mathbb{R}$ is 
collected from each agent $i \in I(h^t)$. 


Decision policy $\pi$ may be stochastic. The payment policy may collect payments 
from an agent across multiple periods. For notational convenience, we let 
$\pi(\theta, \omega) = (k^1, k^2, \ldots)$ denote the sequence of decisions, and $p_i(\theta, \omega) \in \mathbb{R}$ denote the 
total payment collected from agent $i$, given type profile $\theta$ and a realization of uncertain 
events $\omega \in \Omega$. 


Example 16.3. Consider the canonical expiring items environment. The state 
$h^t$ can be defined as a list of reported agent types that are present in period $t$, 
indicating whether each agent is already allocated or not. Decision $k^t \in K(h^t)$ 
decides whether to allocate the item in the current period to some agent that is 
present and unallocated. 


Limited misreports constrain the strategy space available to agents in direct- 
revelation, online mechanisms: 


Definition 16.4 (limited misreports) Let $C(\theta_i) \subseteq \Theta_i$ for $\theta_i \in \Theta_i$ denote the set 
of available misreports to an agent with true type $\theta_i$. 


In the standard model adopted in offline MD, it is typical to assume $C(\theta_i) = \Theta_i$. We 
shall assume no early-arrival misreports, with 
$C(\theta_i) = \{\hat{\theta}_i = (\hat{a}_i, \hat{d}_i, \hat{w}_i) \mid a_i \le \hat{a}_i \le \hat{d}_i,\ \hat{w}_i \in W_i\}$; 
i.e., agent $i$ cannot report an earlier arrival because it does not know 
its type (or know about the mechanism) until $a_i$. Sometimes, we shall also assume 
no late-departure misreports, which together with no early arrivals provides 
$C(\theta_i) = \{\hat{\theta}_i = (\hat{a}_i, \hat{d}_i, \hat{w}_i) \mid a_i \le \hat{a}_i \le \hat{d}_i \le d_i,\ \hat{w}_i \in W_i\}$. 
For example, we could argue that it 
is not credible to claim to have value for a ticket for a last minute Broadway show after 
5 p.m. because the auctioneer knows that it takes at least 2 hours to get to the theater 
and the show starts at 7 p.m. 
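
The two misreport sets can be written as a membership test. This is a minimal sketch under stated assumptions: types are plain `(arrival, departure, valuation)` tuples, the function name `is_available_misreport` is hypothetical, and the condition that the reported valuation lies in $W_i$ is taken for granted rather than checked.

```python
def is_available_misreport(true_type, report, no_late_departure=False):
    """Check whether `report` lies in C(true_type).

    Both arguments are (arrival, departure, valuation) tuples. The default
    models no early-arrival misreports; setting `no_late_departure` adds
    the no late-departure restriction as well.
    """
    a, d, _ = true_type
    a_hat, d_hat, _ = report
    # No early arrival, and the reported window must be nonempty.
    if not (a <= a_hat <= d_hat):
        return False
    # Optionally forbid claiming a later departure than the true one.
    if no_late_departure and d_hat > d:
        return False
    return True
```

For an agent with true type `(1, 2, 10)`, the report `(1, 3, 10)` is available under no early arrivals alone but is excluded once late departures are ruled out.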

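We restrict attention to mechanisms that are either dominant-strategy or Bayes-Nash 
incentive compatible. Let $\theta_{-i} = (\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots)$, 
$\Theta_{-i} = \prod_{j \ne i} \Theta_j$, and $C(\theta_{-i}) = \prod_{j \ne i} C(\theta_j)$, and consider 
misreports $\hat{\theta}_i \in C(\theta_i)$. 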


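Definition 16.5 (DSIC) Online mechanism $M = (\pi, x)$ is dominant-strategy 
incentive-compatible (DSIC) given limited misreports $C$ if 

\[
v_i(\theta_i, \pi(\theta_i, \theta'_{-i}, \omega)) - p_i(\theta_i, \theta'_{-i}, \omega) \ge v_i(\theta_i, \pi(\hat{\theta}_i, \theta'_{-i}, \omega)) - p_i(\hat{\theta}_i, \theta'_{-i}, \omega),
\]

for all $\hat{\theta}_i \in C(\theta_i)$, all $\theta_i$, all $\theta'_{-i} \in C(\theta_{-i})$, all $\theta_{-i} \in \Theta_{-i}$, all $\omega \in \Omega$. 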


It will be convenient to also adopt the terminology truthful in place of DSIC. The 
concept of DSIC requires that an agent maximizes its utility by reporting its true type 
whatever the reports of other agents and for all stochastic events $\omega$. When the decision 
policy itself is stochastic then DSIC requires that the expected utility is maximized 
from a truthful report, whatever the reports of other agents and (again) for all stochastic 
events $\omega$. A randomized mechanism (i.e., one with a stochastic policy) is said to satisfy 
strong-truthfulness when truthful reporting is a dominant strategy for all random coin 
flips by the mechanism, and for all external stochastic events $\omega$. 
For Bayes-Nash incentive compatibility (BNIC), we assume in addition that all 
agents know the correct probabilistic model of the distribution on types and uncertain 
events, and that this is common knowledge. 


Definition 16.6 (BNIC) Online mechanism $M = (\pi, x)$ is Bayes-Nash 
incentive-compatible (BNIC) given limited misreports $C$ if 

\[
E\{v_i(\theta_i, \pi(\theta_i, \theta_{-i}, \omega)) - p_i(\theta_i, \theta_{-i}, \omega)\} \ge E\{v_i(\theta_i, \pi(\hat{\theta}_i, \theta_{-i}, \omega)) - p_i(\hat{\theta}_i, \theta_{-i}, \omega)\},
\]

for all $\hat{\theta}_i \in C(\theta_i)$, all $\theta_i$, where the expectation is taken with respect to the 
distribution on types $\theta_{-i}$, and stochastic events $\omega$, and any randomization within the 
policy. 


BNIC is a weaker solution concept than DSIC because it requires only that truth 
revelation is a best response when other agents are also truthful, and in expectation 
given the distribution on agent types and on stochastic events in the environment. 


16.2.2 Remark: The Revelation Principle 


Commonly held intuition from offline MD suggests that focusing on the class of 
incentive compatible, direct-revelation online mechanisms is without loss of generality. 
However, if agents are unable to send messages to a mechanism in periods $t \notin [a_i, d_i]$ 
then this is not true. 


Example 16.7 (failure of the revelation principle) Consider the model with 
no early-arrival misreports but allow for late-departure misreports. Consider two 
time periods $T = \{1, 2\}$, a single unit of an indivisible item to allocate in either 
period and an environment with a single agent. Denote the type of the 
agent $(a_i, d_i, w_i)$ with $w_i > 0$ to denote its value for the item if allocated in 
period $t \in [a_i, d_i]$. Suppose that possible types are $(1, 1, 1)$ or $(1, 2, 1)$. Consider 
an indirect mechanism that allows an agent to send one of messages 
$\{1, 2\}$ in period 1 and $\{1\}$ in period 2. Let $\phi$ denote a null message. Consider 
decision policy: $\pi^1(1) = 0$, $\pi^1(2) = 1$, $\pi^2(1, z) = \pi^2(2, z) = 0$, for $z \in \{1, \phi\}$, 
writing the state as the sequence of messages received and decision $k^t \in \{0, 1\}$ 
to indicate whether or not the agent is allocated in period $t \in \{1, 2\}$. Consider 
payment policy: $x^1(1) = x^2(1, \phi) = x^2(1, 1) = 0$, $x^1(2) = 3$, $x^2(2, 1) = -2.01$, 
$x^2(2, \phi) = 0$. Type $(1, 1, 1)$ will report message 1 in period 1 because 
reporting message 2 is not useful and it cannot report messages $(2, 1)$. Type $(1, 2, 1)$ 
will report messages $(2, 1)$ and has no useful deviation. This policy cannot be 
implemented as a DSIC direct-revelation mechanism because type $(1, 2, 1)$ is 
allocated in period 1 for payment 0.99, and so type $(1, 1, 1)$ (which is unallocated 
if truthful) will want to report type $(1, 2, 1)$. 
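
The arithmetic in Example 16.7 can be checked mechanically. The sketch below is illustrative only: the function names and the use of `None` for the null message are assumptions, and the "direct" mechanism is modeled simply as giving each reported type the outcome its best response obtains in the indirect mechanism.

```python
NULL = None  # stands in for the null message


def allocated(m1, m2):
    """Decision policy: the item is allocated (in period 1) only on message 2."""
    return 1 if m1 == 2 else 0


def total_payment(m1, m2):
    """Sum of the period-1 and period-2 payments from the example."""
    x1 = {1: 0.0, 2: 3.0}[m1]
    x2 = {(1, NULL): 0.0, (1, 1): 0.0, (2, 1): -2.01, (2, NULL): 0.0}[(m1, m2)]
    return x1 + x2


def feasible_messages(departure):
    """An agent that departs in period 1 cannot send a period-2 message."""
    second = [NULL, 1] if departure >= 2 else [NULL]
    return [(m1, m2) for m1 in (1, 2) for m2 in second]


def utility(value, messages):
    return value * allocated(*messages) - total_payment(*messages)


def best_response(departure, value=1.0):
    return max(feasible_messages(departure), key=lambda m: utility(value, m))


def direct_utility(true_departure, reported_departure, value=1.0):
    """Utility in the induced direct mechanism when late departures can be claimed.

    Allocation always happens in period 1, inside every type's window, so
    `true_departure` does not change the value received.
    """
    return utility(value, best_response(reported_departure, value))
```

Type $(1, 1, 1)$ is held to utility 0 in the indirect mechanism, but in the direct mechanism it gains about 0.01 by claiming departure 2, which is the failure the example describes.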


The revelation principle fails in this example because the indirect mechanism 
prevents the agent from claiming a later departure than its true departure. In fact, 
the revelation principle continues to hold when misreports are limited to no-late 
departures in addition to no-early arrivals. A form of the revelation principle can 
also be recovered by introducing simple “heartbeat” messages into a direct-revelation 
mechanism, whereby an agent still makes a single report about its type but must also 
send a noninformative heartbeat message in every period $t \in [a_i, d_i]$.¹ We leave the 
derivation of this “revelation principle plus heartbeat” result as an exercise. 

With this in hand, and in keeping with the current literature on online mechanisms, 
we will focus on incentive-compatible, direct revelation online mechanisms in this 
chapter. 


16.3 Single-Valued Online Domains 


In this section we develop a methodology for the design of DSIC online mechanisms in 
the restricted domain of single-valued preferences. We identify the central role of mono- 
tonic decision policies in the design of truthful online mechanisms. The methodology 
is illustrated in the design of a dynamic auction for two environments: (a) allocating a 
sequence of expiring items and (b) allocating a single, indivisible item in some period 
while adapting to information about agent types. Both auctions are model-free and we 
use competitive analysis to study their efficiency and revenue properties. We close the 
section with remarks that situate the study of truthful online mechanisms in the context 
of the wider mechanism design literature. 


16.3.1 Truthfulness for Single-Valued Preference Domains 


An agent with single-valued preferences has the same value, $r_i$, whenever any of a set of 
interesting decisions is made in some period $t \in [a_i, d_i]$, and has value for at most one 
such decision. For example, in the single-item allocation problems considered earlier 
an agent’s interesting set was all decisions that allocate an item to the agent. 

Let $\mathcal{L}_i = \{L_1, \ldots, L_m\}$ describe a language for defining interesting sets for agent $i$, 
where $L \subseteq K = \bigcup_h K(h)$, for any $L \in \mathcal{L}_i$, defines a subset of single-period decisions. 
Let $\succeq_L$ be a partial order defined on $\mathcal{L}_i$. The valuation component $w_i \in W_i$ of an 
agent’s type, $\theta_i = (a_i, d_i, w_i)$, defines $w_i = (r_i, L_i)$ with $W_i = \mathbb{R} \times \mathcal{L}_i$. This picks out 
the interesting set and defines the value on decisions in that set. 


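Definition 16.8 (single-valued) A single-valued online domain is one where 
each agent $i$ has a type $\theta_i = (a_i, d_i, (r_i, L_i))$, with reward $r_i \in \mathbb{R}$ and interesting 
set $L_i \in \mathcal{L}_i$, where type $\theta_i$ defines valuation: 

\[
v_i(\theta_i, k) =
\begin{cases}
r_i, & \text{if } k^t \in \bigcup_{L \succeq_L L_i,\, L \in \mathcal{L}_i} L \text{ for some } t \in [a_i, d_i] \\
0, & \text{otherwise.}
\end{cases}
\tag{16.1}
\]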


To keep things simple, we assume that the set of interesting decisions is known by 
the mechanism and thus the private information is restricted to arrival, departure, and 
its value for a decision. We comment on how to relax this assumption at the end of 
the section. Given the known interesting-set assumption, define a partial order $\preceq_\theta$ on 
types: 

\[
\theta_1 \preceq_\theta \theta_2 \equiv (a_1 \ge a_2) \wedge (d_1 \le d_2) \wedge (r_1 \le r_2) \wedge (L_1 = L_2). \tag{16.2}
\]

This will be sufficient because we will not need to reason about misreports of interesting 
set $L_i$. Consider the following example. 


¹ Thanks to Bobby Kleinberg for suggesting this interpretation. 


Example 16.9 (known single-minded combinatorial auction) Multiple units 
of indivisible, heterogeneous items $G$ are in uncertain supply and cannot be 
stored from one period to the next. Consider single-valued preferences, where 
interesting set $L_i \in \mathcal{L}_i$ has an associated bundle $S(L_i) \subseteq G$, and characterizes 
all single-period decisions that allocate agent $i$ bundle $S(L_i)$, irrespective of the 
allocation to other agents. Define partial order $L_1 \succeq_L L_2 \equiv S(L_1) \supseteq S(L_2)$ for all 
$L_1, L_2 \in \mathcal{L}_i$. Agent $i$ with type $\theta_i = (a_i, d_i, (r_i, L_i))$ has value $r_i$ when decision 
$k^t$ allocates a bundle containing at least $S(L_i)$ items to the agent in some period 
$t \in [a_i, d_i]$. 


The subsequent analysis is developed for deterministic policies. We adopt shorthand 
$\pi_i(\theta_i, \theta_{-i}, \omega) \in \{0, 1\}$ to indicate whether policy $\pi$ makes an interesting decision for 
agent $i$ with type $\theta_i$ in some period $t \in [a_i, d_i]$, fixing type profile $\theta_{-i}$ and stochastic 
(external) events $\omega \in \Omega$. Since we are often considering auction domains, we may 
also refer to an interesting decision for an agent as an allocation to the agent. The 
analysis immediately applies to the case of stochastic policies when coupled with 
strong-truthfulness.² We elaborate more on stochastic policies at the end of the section. 


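Definition 16.10 (critical value) The critical value for agent $i$ given type 
$\theta_i = (a_i, d_i, (r_i, L_i))$ and deterministic policy $\pi$ in a single-valued domain is defined 
as 

\[
v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega) =
\begin{cases}
\min r'_i \ \text{s.t.}\ \pi_i(\theta'_i, \theta_{-i}, \omega) = 1 \text{ for } \theta'_i = (a_i, d_i, (r'_i, L_i)) \\
\infty, \quad \text{if no such } r'_i \text{ exists,}
\end{cases}
\tag{16.3}
\]

where types $\theta_{-i}$ and stochastic events $\omega \in \Omega$ are fixed. 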
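
As a concrete illustration of (16.3), the sketch below computes critical values by exhaustive scan. Everything specific here is an assumption made for the example: the policy is a greedy one for the expiring items environment (each period the highest unassigned active reward wins, ties going to the focal agent), rewards are restricted to integers in a bounded range so that the minimum exists and can be found by scanning, and the names `wins` and `critical_value` are hypothetical.

```python
import math


def wins(report, others, horizon):
    """Return True if the focal report (a, d, r) is allocated an item.

    Illustrative greedy policy: in each period the highest unassigned
    active reward wins; ties are resolved in favor of the focal agent.
    `others` is a list of (arrival, departure, reward) reports.
    """
    a, d, r = report
    taken = set()
    for t in range(1, horizon + 1):
        rivals = [(rr, j) for j, (aa, dd, rr) in enumerate(others)
                  if aa <= t <= dd and j not in taken]
        top = max(rivals, default=None)  # best rival this period, if any
        if a <= t <= d and (top is None or r >= top[0]):
            return True
        if top is not None:
            taken.add(top[1])  # the best rival takes this period's item
    return False


def critical_value(a, d, others, horizon, max_reward):
    """Smallest integer reward in [0, max_reward] that wins, else infinity."""
    for r in range(max_reward + 1):
        if wins((a, d, r), others, horizon):
            return r
    return math.inf


# The other two buyers from the ice cream example.
OTHERS = [(1, 2, 80), (2, 2, 60)]
cv_wide = critical_value(1, 2, OTHERS, horizon=2, max_reward=200)
cv_early = critical_value(1, 1, OTHERS, horizon=2, max_reward=200)
cv_late = critical_value(2, 2, OTHERS, horizon=2, max_reward=200)
```

In this instance the window $[1, 2]$ has critical value 60, while the tighter window $[1, 1]$ has critical value 80: the critical value depends on the reported window but not on the reported reward.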


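Definition 16.11 (monotonic) Deterministic policy $\pi$ is monotonic if 
$(\pi_i(\theta_i, \theta_{-i}, \omega) = 1) \wedge (r_i > v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega)) \Rightarrow (\pi_i(\theta'_i, \theta_{-i}, \omega) = 1)$ 
for all $\theta'_i \succeq_\theta \theta_i$, for all $\theta_{-i}$, all $\omega \in \Omega$. 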


The “strict profit” condition, $r_i > v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega)$, is added to prevent weak 
indifference when $\theta'_i \succeq_\theta \theta_i$ and $r'_i = r_i$, and is redundant when $r'_i > r_i$. Say that an 
arrival-departure interval $[a'_i, d'_i]$ is tighter than $[a_i, d_i]$ if $a'_i \ge a_i$ and $d'_i \le d_i$, and 
weaker otherwise. 


² It is convenient for this purpose to consider the random coin flips of a policy as included in stochastic events $\omega$ 
so that no notational changes are required. 
Lemma 16.12 The critical value to agent $i$ is independent of reward $r_i$ and 
(weakly) monotonically increasing in tighter arrival-departure intervals, given a 
deterministic, monotonic policy. 


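PROOF Fix some $\theta_{-i}$, $\omega \in \Omega$. Assume for contradiction that $\theta'_i \preceq_\theta \theta_i$ so that 
$a'_i \ge a_i$ and $d'_i \le d_i$, but $v^c_{(a'_i, d'_i, L_i)}(\theta_{-i}, \omega) < v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega)$. Modify the reward of type 
$\theta'_i = (a'_i, d'_i, (r'_i, L_i))$ such that $r'_i := v^c_{(a'_i, d'_i, L_i)}(\theta_{-i}, \omega)$ and modify the reward of 
type $\theta_i = (a_i, d_i, (r_i, L_i))$ such that $r_i := v^c_{(a'_i, d'_i, L_i)}(\theta_{-i}, \omega)$. Now, we still have 
$\theta'_i \preceq_\theta \theta_i$, but $\pi_i(\theta'_i, \theta_{-i}, \omega) = 1$ while $\pi_i(\theta_i, \theta_{-i}, \omega) = 0$ and a contradiction with 
monotonicity. 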


Theorem 16.13. A monotonic, deterministic decision policy $\pi$ can be truthfully 
implemented in a domain with known interesting set single-valued preferences, 
and no early-arrival and no late-departure misreports. 


PROOF Define payment policy $x_i^t(h^t) = 0$ for all $t \ne d_i$, and with 

\[
x_i^t(h^t) =
\begin{cases}
v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega), & \text{if } \pi_i(\theta_i, \theta_{-i}, \omega) = 1 \\
0, & \text{otherwise}
\end{cases}
\tag{16.4}
\]

when $t = d_i$. This critical-value payment is collected upon departure. Fix $\theta_{-i}$, 
$\theta_i = (a_i, d_i, (r_i, L_i))$, and $\omega \in \Omega$, assume that agent $i$ is truthful, and proceed 
by case analysis. (a) If agent $i$ is not allocated, $v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega) > r_i$ and to be 
allocated, the agent must report some $\theta'_i \succeq_\theta \theta_i$, which it can only do with a report 
$\theta'_i = (a_i, d_i, (r'_i, L_i))$, and $r'_i > r_i$, by limited misreports. But since the critical 
value is greater than its true value $r_i$, it will have negative utility if it wins for $r'_i$. 
(b) If agent $i$ is allocated, its utility is nonnegative since $v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega) \le r_i$ and 
it does not want to report a type for which it would not be allocated. Consider any 
report $\theta'_i \in C(\theta_i)$ for which the agent continues to be allocated. The critical 
value for $\theta'_i$ is (weakly) greater than for $\theta_i$: it is independent of the reported 
reward $r'_i$, and, by Lemma 16.12, it is weakly increasing for the alternate 
arrival-departure interval, which must be tighter by limited misreports. Hence no such 
report reduces the payment. 
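
The case analysis can be sanity-checked by brute force on a toy instance. The sketch below is not the construction from the proof but an illustration of it under loud assumptions: with $\theta_{-i}$ and $\omega$ fixed, the policy is taken to allocate agent $i$ whenever its reported reward meets a hypothetical per-period threshold somewhere in its reported window (such a policy is monotonic), the threshold values in `THRESHOLDS` are made up, and the payment is the critical value of the reported window as in (16.4). All function names are hypothetical.

```python
import math

# Hypothetical per-period thresholds faced by agent i (theta_{-i}, omega fixed).
THRESHOLDS = {1: 80, 2: 60, 3: 90}


def allocated(report):
    """Agent is allocated if its reward meets some threshold in its window."""
    a, d, r = report
    return any(r >= THRESHOLDS[t] for t in range(a, d + 1))


def critical_value(a, d):
    """Smallest reward that wins with window [a, d] under this policy."""
    return min((THRESHOLDS[t] for t in range(a, d + 1)), default=math.inf)


def utility(true_type, report):
    """True reward minus critical-value payment if allocated, else 0.

    Assumes no early-arrival and no late-departure misreports, so any
    allocation falls inside the agent's true window.
    """
    _, _, r = true_type
    a_hat, d_hat, _ = report
    if not allocated(report):
        return 0
    return r - critical_value(a_hat, d_hat)


def misreports(true_type, rewards):
    """All reports with a <= a_hat <= d_hat <= d and a reward from `rewards`."""
    a, d, _ = true_type
    return [(a_hat, d_hat, r_hat)
            for a_hat in range(a, d + 1)
            for d_hat in range(a_hat, d + 1)
            for r_hat in rewards]


def best_misreport_gain(true_type, rewards):
    """Largest utility improvement over truthful reporting (0 if none)."""
    u_truth = utility(true_type, true_type)
    return max(utility(true_type, rep)
               for rep in misreports(true_type, rewards)) - u_truth
```

Calling `best_misreport_gain((1, 3, 100), range(0, 121))` returns 0 on this instance: tightening the window can only raise the critical value, and changing the reward never changes the price.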


We turn now to identifying necessary conditions for truthfulness. An online mechanism 
satisfies individual rationality (IR) when every agent has nonnegative utility 
in equilibrium. This is required when agents cannot be forced to participate in the 
mechanism. 


Lemma 16.14 (critical payment) In a (known interesting set) single-valued 
preference domain, any truthful online mechanism that is defined for a determin- 
istic decision policy and satisfies IR must collect a payment equal to the critical 
value from each allocated agent. 


PROOF Fix $\theta_{-i}$ and $\omega \in \Omega$. Payment $p_i(\theta_i, \theta_{-i}, \omega)$, made by agent $i$ 
contingent on successful allocation, cannot depend on reward $r_i$ because if 
$p_i(\theta'_i, \theta_{-i}, \omega) < p_i(\theta_i, \theta_{-i}, \omega)$ for $\theta_i = (a_i, d_i, (r_i, L_i))$ and $\theta'_i = (a_i, d_i, (r'_i, L_i))$ 
and $r'_i \ne r_i$ and $\min(r'_i, r_i) > v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega)$ then an agent with type $\theta_i$ should 
report type $\theta'_i$. Fix type $\theta_i$ such that $\pi_i(\theta_i, \theta_{-i}, \omega) = 1$. Now, if 
$p_i(\theta_i, \theta_{-i}, \omega) < v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega)$ then an agent with type $\theta'_i = (a_i, d_i, (r'_i, L_i))$ and 
$p_i(\theta_i, \theta_{-i}, \omega) < r'_i < v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega)$ should report $\theta_i$. This is possible even with negative 
payment $p_i(\theta_i, \theta_{-i}, \omega)$ as long as rewards can also be negative. On the other hand, if 
$v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega) < p_i(\theta_i, \theta_{-i}, \omega)$ then the mechanism fails IR for an agent with 
type $\theta'_i = (a_i, d_i, (r'_i, L_i))$ and $v^c_{(a_i, d_i, L_i)}(\theta_{-i}, \omega) < r'_i < p_i(\theta_i, \theta_{-i}, \omega)$. 


Say that a domain satisfies reasonable misreporting when an agent with type $\theta_i$ has
available at least misreports $\theta_i' \in C(\theta_i)$ with $a_i' \ge a_i$, $d_i' \le d_i$ and any reward $r_i'$.


Theorem 16.15 In a known interesting set single-valued preference domain
with reasonable misreporting, any deterministic policy $\pi$ that can be truthfully
implemented in an IR mechanism that does not pay unallocated agents must be 
monotonic. 


PROOF Fix $\theta_{-i}$, $\omega \in \Omega$. Assume, for contradiction, that $\theta_i \prec_\theta \theta_i'$ with
$\theta_i = (a_i, d_i, (r_i, L_i))$ and $\theta_i' = (a_i', d_i', (r_i', L_i))$, but $\pi_i(\theta_i, \theta_{-i}, \omega) = 1$, value
$r_i > v^c_{(a_i,d_i,L_i)}(\theta_{-i}, \omega)$ and $\pi_i(\theta_i', \theta_{-i}, \omega) = 0$. We must have $p_i(\theta_i, \theta_{-i}, \omega) =
v^c_{(a_i,d_i,L_i)}(\theta_{-i}, \omega)$ by Lemma 16.14. Thus, agent $i$ with type $\theta_i$ must have strictly
positive utility in the mechanism. On the other hand, the agent with type $\theta_i' \succ_\theta \theta_i$
is not allocated, makes nonnegative payment, and has (weakly) negative utility.
But, an agent with type $\theta_i'$ can report $\theta_i$, which presents a contradiction with
truthfulness.


The restriction that losing agents do not receive a payment plays an important role. 

To see this, consider a domain with no late-departure misreports, fix $\theta_{-i}$, and consider
a single-item valuation with possible types $\Theta_i = \{(1, 1, \$10), (1, 2, \$10)\}$. Policy
$\pi_i((1, 1, \$10), \theta_{-i}) = 1$ and $\pi_i((1, 2, \$10), \theta_{-i}) = 0$ is nonmonotonic, but can be truthfully
implemented with payments $p_i((1, 1, \$10), \theta_{-i}) = 8$ and $p_i((1, 2, \$10), \theta_{-i}) = -100$.
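The counterexample above can be checked mechanically. The following is a minimal sketch rather than part of the original analysis: the names `EARLY`, `LATE`, `feasible_reports`, and `utility` are illustrative, utility is assumed to be quasi-linear (value if allocated, minus payment), and the only misreports considered are those permitted with no early-arrival and no late-departure misreports.

```python
# Types are (arrival, departure, value); all names here are illustrative.
EARLY = (1, 1, 10)
LATE = (1, 2, 10)

allocated = {EARLY: 1, LATE: 0}    # nonmonotonic: the more patient type loses
payment = {EARLY: 8, LATE: -100}   # the losing type receives a payment of 100


def feasible_reports(true_type):
    """Reports allowed with no early-arrival and no late-departure misreports."""
    a, d, _ = true_type
    return [r for r in (EARLY, LATE) if r[0] >= a and r[1] <= d]


def utility(true_type, report):
    """Assumed quasi-linear utility: value if allocated, minus payment."""
    return allocated[report] * true_type[2] - payment[report]


def is_truthful():
    """True if no type gains from any feasible misreport."""
    return all(
        utility(t, t) >= utility(t, r)
        for t in (EARLY, LATE)
        for r in feasible_reports(t)
    )


print(is_truthful())                              # True
print(utility(LATE, LATE), utility(LATE, EARLY))  # 100 2
```

The patient type prefers to lose and collect the transfer, which is why excluding payments to unallocated agents matters for Theorem 16.15.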

Monotonic-Late. Theorem 16.13 can be generalized to a domain with arbitrary misreports
of departure. For a particular $\theta_{-i}$, $\omega \in \Omega$ and type $\theta_i = (a_i, d_i, (r_i, L_i))$, define
the critical departure, $d^c_{(a_i,d_i,L_i)}(\theta_{-i}, \omega)$, as the earliest departure $d_i' \le d_i$ for which
$v^c_{(a_i,d_i',L_i)}(\theta_{-i}, \omega) = v^c_{(a_i,d_i,L_i)}(\theta_{-i}, \omega)$. This is the earliest departure time that agent $i$
could have reported without increasing the critical value. Given this, we say that policy
$\pi$ is monotonic-late if it is monotonic and if no interesting decision is made for agent $i$
before its critical departure period. A monotonic-late, deterministic decision policy
can be truthfully implemented in a domain with no early-arrival misreports but arbitrary
misreports of departure. Moreover, this requirement of monotonic-late is necessary for
truthfulness in this environment.


16.3.2 Example: A Dynamic Auction with Expiring Items 


For our first detailed example we revisit the problem of selling an expiring item, such 
as ice cream, time on a shared computer, or network resources, to dynamically arriving 
buyers. This is the canonical expiring items environment. Assume for notational convenience
that the time horizon is finite. We design a strongly truthful online auction
that includes random tie-breaking and satisfies monotonicity (however ties are broken). 

We assume no early-arrival and no late-departure misreports. The no late-departure 
assumption can be readily motivated in physical environments. For ice cream, think 
about a tour group that will be leaving at a designated time so that it is not credible to 
claim a willingness to wait for an ice cream beyond that period. For network resources, 
such as an auction for access to WiFi bandwidth in a coffee house, think about requiring 
a user to be present for the entire period of time reported to the mechanism. A technical 
argument for why we need this assumption is also provided below.³


Competitive analysis. We perform a worst-case analysis and consider the performance 
of the mechanism, given a sequence of types that are generated by an “adversary” whose 
task it is to make the performance as bad as possible. Of particular relevance is the 
method of competitive analysis, typically adopted in the study of online algorithms. 
The following question is asked: how effectively does the performance of the online 
mechanism “compete” with that of an offline mechanism that is given complete infor- 
mation about the future arrival of agent types? This question is asked in the worst-case, 
for an adversarially defined input. 

Competitive analysis is most easily justified when the designer does not have a good 
model of the environment. As a motivating example, consider selling a completely 
new product or service, for which it is not possible to conduct market research to 
get a good model of demand. Competitive analysis can also lead to mechanisms that 
enjoy good average-case performance in practice, provide insight into how to design 
robust mechanisms, and produce useful “lower-bounds.” A lower-bound for a problem 
makes a statement about the best possible performance that can be achieved by any 
mechanism. Online mechanisms are of special interest when their performance matches 
the lower bound. 

In performing competitive analysis, one needs to define: an optimality criterion; 
a model of the power of the adversary in selecting worst-case inputs; and an offline
benchmark, defined with perfect information about the future. We are interested in the 
efficiency of a dynamic auction for expiring items and adopt as our optimality criterion 
the value of the best possible offline allocation. This can be computed as follows: 


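$$V^*(\theta) = \max_{y,x} \sum_{i=1}^{N} y_i w_i \tag{16.5}$$

$$\text{s.t.} \quad \sum_{t=a_i}^{d_i} x_{it} \ge y_i, \quad \forall i \in \{1, \ldots, N\} \tag{16.6}$$

$$\sum_{i : t \in [a_i, d_i]} x_{it} \le 1, \quad \forall t \in T, \tag{16.7}$$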


3 The requirement of no late departures can be dispensed with, while still retaining truthfulness, in environments 
in which it is possible to schedule a resource in some period before an agent’s reported departure, but withhold 
access to the benefit from the use of the resource until the reported departure; e.g., in grid computing, jobs can 
run on the machine but the result then held until reported departure. 

where $y_i \in \{0, 1\}$ indicates whether bid $i$ is allocated and $x_{it} \in \{0, 1\}$ indicates the
period in which it is allocated.⁴ For our adversarial model, we consider a powerful
adversary that is able to pick arbitrary agent types, including the value, arrival, and 
departure of agents. 

Let $z \in Z$ denote an input from the set $Z$ of inputs available to the adversary and $\theta_z$ the
corresponding type profile. Let $\mathrm{Val}(\pi(\theta_z))$ denote the total value of the decisions made
by policy $\pi$ given input $\theta_z$. An online mechanism is $c$-competitive for efficiency if

$$\min_{z \in Z} \mathbb{E}\left\{ \frac{\mathrm{Val}(\pi(\theta_z))}{V^*(\theta_z)} \right\} \ge \frac{1}{c} \tag{16.8}$$

for some constant $c \ge 1$. Such a mechanism is guaranteed to achieve within fraction $1/c$ of
the value of the optimal offline algorithm, whatever the input sequence. The expectation 
allows for stochastic policies and can also allow for the use of randomization in defining 
the power of the adversary (we will see this in the next section). Competitive ratio c is 
referred to as an upper-bound on the online performance of the mechanism. 

Now consider the following modification to Auction 1: 


Auction 2. A bid from an agent is a claim about its type, $\hat{\theta}_i = (\hat{a}_i, \hat{d}_i, \hat{w}_i)$, necessarily
made in period $t = \hat{a}_i$.


(i) In each period, $t$, allocate the item to the highest unassigned bid, breaking ties at
random. 

(ii) Every allocated agent pays its critical-value payment, collected upon its reported 
departure. 


The auction is the same as Auction 1 except for the payment rule, which now charges 
the critical value rather than the second price in the period in which an agent wins. We 
refer to this as a “greedy auction” because the decision policy myopically maximizes 
value in each period. When every bidder is impatient, then the auction reduces to a 
sequence of Vickrey auctions (i.e., Auction 1.) 


Example 16.16 Consider the earlier example, with three agents and types
$\theta_1 = (1, 2, 100)$, $\theta_2 = (1, 2, 80)$, and $\theta_3 = (2, 2, 60)$, and one item to sell in each period.
Suppose that all three agents bid truthfully. The greedy allocation rule sells to
agent 1 in period 1 and then agent 2 in period 2. Agent 1's payment is 60 because
this is the critical value for arrival-departure (1, 2), given the bids of other agents.
(A bid of just above 60 would allow the agent to win, albeit in period 2 instead of 
period 1.) Agent 2’s payment is also 60. 
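The allocation and payments in this example can be reproduced with a short simulation. This is a sketch under stated assumptions, not the chapter's own implementation: the function names are illustrative, ties are broken deterministically (against the agent whose critical value is being probed) instead of at random, and the critical value is found by probing bids that fall strictly between adjacent competing values.

```python
from typing import Dict, Optional, Tuple

Bid = Tuple[int, int, float]  # (arrival, departure, value)


def greedy_allocation(bids: Dict[str, Bid], horizon: int,
                      loses_ties: Optional[str] = None) -> Dict[str, int]:
    """Greedy rule: each period, sell to the highest unassigned bid present."""
    assigned: Dict[str, int] = {}
    for t in range(1, horizon + 1):
        present = [
            (w, name != loses_ties, name)
            for name, (a, d, w) in bids.items()
            if a <= t <= d and name not in assigned
        ]
        if present:
            assigned[max(present)[2]] = t
    return assigned


def critical_value(name: str, bids: Dict[str, Bid], horizon: int) -> float:
    """Infimum of winning values for `name`, holding its interval and other bids fixed."""
    a, d, _ = bids[name]
    others = {k: v for k, v in bids.items() if k != name}
    cuts = sorted({0.0} | {float(w) for (_, _, w) in others.values()})
    for lo, hi in zip(cuts, cuts[1:] + [cuts[-1] + 2.0]):
        probe = (lo + hi) / 2.0  # strictly between two adjacent competing values
        trial = dict(others)
        trial[name] = (a, d, probe)
        if name in greedy_allocation(trial, horizon, loses_ties=name):
            return lo
    return float("inf")


bids = {"agent1": (1, 2, 100), "agent2": (1, 2, 80), "agent3": (2, 2, 60)}
winners = greedy_allocation(bids, horizon=2)
payments = {name: critical_value(name, bids, horizon=2) for name in winners}
print(winners)   # {'agent1': 1, 'agent2': 2}
print(payments)  # {'agent1': 60.0, 'agent2': 60.0}
```

Probing between adjacent competing values is sufficient here because the greedy rule only compares bids, so the outcome for an agent can change only when its bid crosses another bid's value.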


Theorem 16.17 Auction 2 is strongly truthful and 2-competitive for efficiency 
in the expiring-items environment with no early-arrival and no late-departure 
misreports. 


4 Note that the integer program allows the possibility of allocating more than one item to a winning bid but that 
this does not change the value of the objective and is not useful. 




PROOF Suppose that random tie-breaking is invariant to reported arrival and 
departure. The auction is strongly truthful because the allocation function is 
monotone: if agent $i$ wins in some period $t \in [a_i, d_i]$ then it continues to win
either earlier or in the same period for $w_i' > w_i$, and for $a_i' < a_i$ or $d_i' > d_i$. For
competitiveness, consider a set of types $\theta$ and establish that the greedy online
allocation rule is 2-competitive by a charging argument. For any agent $i$ that
is allocated offline but not online, charge its value to the online agent that was 
allocated in period t in which agent i is allocated offline. Since agent i is not 
allocated online, it is present in period t, and the greedy rule allocates to another 
agent in that period with at least as much value as agent i. For any agent i that is 
allocated offline and also online, charge its value to itself in the online solution. 
Each agent that is allocated in the online solution is charged at most twice, and 
in all cases for a value less than or equal to its own value. Therefore the optimal 
offline value $V^*(\theta)$ is at most twice the value of the greedy solution.
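The factor of 2 can be observed on small instances by comparing the greedy value with the offline optimum. The sketch below is illustrative only: the function names and the two-agent instance are assumptions made for this example, ties are broken deterministically, and the offline optimum is computed by exhaustive search over assignments, which is practical only for tiny inputs.

```python
from functools import lru_cache
from typing import Dict, Tuple

Bid = Tuple[int, int, int]  # (arrival, departure, value)


def greedy_value(bids: Dict[str, Bid], horizon: int) -> int:
    """Total value when each period's item goes to the highest unassigned bid present."""
    assigned = set()
    total = 0
    for t in range(1, horizon + 1):
        present = [(w, name) for name, (a, d, w) in bids.items()
                   if a <= t <= d and name not in assigned]
        if present:
            w, name = max(present)
            assigned.add(name)
            total += w
    return total


def offline_value(bids: Dict[str, Bid], horizon: int) -> int:
    """Best total value with full knowledge of all bids (exhaustive search)."""
    names = sorted(bids)

    @lru_cache(maxsize=None)
    def best(t: int, used: int) -> int:
        if t > horizon:
            return 0
        value = best(t + 1, used)  # option: leave period t unsold
        for k, name in enumerate(names):
            a, d, w = bids[name]
            if a <= t <= d and not (used >> k) & 1:
                value = max(value, w + best(t + 1, used | (1 << k)))
        return value

    return best(1, 0)


# Hypothetical near-worst case: the patient bidder narrowly outbids the impatient one.
tight = {"impatient": (1, 1, 100), "patient": (1, 2, 101)}
print(greedy_value(tight, 2), offline_value(tight, 2))  # 101 201
```

On this instance the greedy rule serves the patient bidder first and the impatient bidder departs unserved, so the offline value is just under twice the greedy value.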


We now understand that the decision policy in Auction 1 was monotonic but that 
Auction 1 was not truthful because the payments were not critical-value payments. 

It is interesting to note that there is a 1.618-competitive online algorithm for this 
problem. However, this algorithm is not monotonic and cannot be implemented truth- 
fully. In fact, we have a tight lower bound for the problem of achieving efficiency and 
truthfulness. 


Theorem 16.18 No truthful, IR, and deterministic online auction can obtain a 
$(2 - \epsilon)$-approximation for efficiency in the expiring items environment with no
early-arrival and no late-departure misreports, for any constant $\epsilon > 0$.


PROOF Fix $\epsilon > 0$, consider $T = \{1, 2\}$ and construct the following three
scenarios: (i) Consider agents $\theta_1 = (1, 1, q(1 + \delta))$, $\theta_2 = (1, 2, q)$, and choose
$0 < \delta < \frac{\epsilon}{1 - \epsilon}$ so that $\frac{1 + \delta}{2 + \delta} < \frac{1}{2 - \epsilon}$ and the auction must allocate to both agents to
be $(2 - \epsilon)$-competitive. Let $q > v^c_{(1,1)}(\theta_2)$ (dropping dependence on $\omega$ because
there are no stochastic events to consider), so that agent 1 must have strictly
positive utility since the price is independent of reported value (for truthfulness)
and less than or equal to $v^c_{(1,1)}(\theta_{-1})$ for IR. (ii) As in (i) except
$\theta_1 \to \theta_1' = (1, 2, q(1 + \delta))$ and a new type $\theta_3 = (2, 2, \infty)$ is introduced. Agent
1 must be allocated else it can report type $\theta_1$. Moreover, agent 1 must be allocated
in period 1 because otherwise the mechanism cannot compete when $\theta_3$
arrives. Agent 2 is not allocated. (iii) As in (i) except $\theta_1 \to \theta_1' = (1, 2, q(1 + \delta))$
and $\theta_2 \to \theta_2' = (1, 1, q)$. The auction must allocate to both agents to be $(2 - \epsilon)$-competitive.
Further assume that $q > v^c_{(1,1)}(\theta_1')$, which is without loss of generality
because if $q = v^c_{(1,1)}(\theta_1')$ then we can repeat the analysis with $q' = \alpha q$ for $\alpha > 1$
replacing $q$ throughout. But now agent 2 with type $\theta_2'$ has strictly positive utility
since its payment is no greater than its critical value and the auction is not truthful
in scenario (ii) because agent 2 can benefit by deviating and reporting $\theta_2'$.


The following provides a technical justification for why the no late-departure mis- 
reports assumption is required in this environment. 

Theorem 16.19 No truthful, IR, and deterministic online auction can obtain a
constant approximation ratio for efficiency in the expiring items environment with 
no early-arrival misreports but arbitrary misreports of departure. 


PROOF Consider $M$ periods. Fix $\theta_{-i}$ with $v^c_{(1,1)}(\theta_{-i}) < \infty$ (dropping dependence
on $\omega$ because there are no stochastic events to consider). First show that
any agent with type $\theta_i = (1, M, w_i)$ for $w_i > v^c_{(1,M)}(\theta_{-i})$ must be allocated in period 1.
For this, first show that $v^c_{(1,M)}(\theta_{-i}) = v^c_{(1,1)}(\theta_{-i})$. Construct $\theta_i' = (1, M, w_i')$
with $w_i' = v^c_{(1,1)}(\theta_{-i}) + \epsilon$, some $\epsilon > 0$. By truthfulness and thus monotonicity we have
$v^c_{(1,M)}(\theta_{-i}) \le v^c_{(1,1)}(\theta_{-i})$ and agent $i$ must be allocated. Moreover, it must be allocated
in period 1 else an adversary can generate $M - 1$ bids $\{(t, t, \beta^{t-1})\}$ for
large $\beta > 0$ and $t \in \{2, \ldots, M\}$, all of which must be accepted for the auction
to be constant competitive. But in this case the agent should deviate and report
$(1, 1, w_i')$, and be allocated in period 1 with payment $v^c_{(1,1)}(\theta_{-i}) < w_i'$ and
have positive utility. Since type $(1, M, w_i')$ is allocated in period 1, we must
have $v^c_{(1,M)}(\theta_{-i}) = v^c_{(1,1)}(\theta_{-i})$ by truthfulness and the critical-payment lemma
else type $(1, 1, w_i')$ can deviate and report $(1, M, w_i')$ and do better. Consider
again type $(1, M, w_i)$, we now have $w_i > v^c_{(1,M)}(\theta_{-i}) \Rightarrow w_i > v^c_{(1,1)}(\theta_{-i})$ and the
agent must be allocated in period 1. To finish the proof, now construct type profile
$\theta = \{(1, M, q_1), \ldots, (1, M, q_M)\}$ with $q_1, \ldots, q_M$ unique values drawn from
$[q, q + \delta]$ for some $q > 0$ and $\delta > 0$. For any $i$, we must have $v^c_{(1,1)}(\theta_{-i}) < \infty$
else the mechanism is not competitive because the adversary could replace
type $i$ with $\theta_i' = (1, 1, w_i')$ and some arbitrarily large $w_i'$. We can also assume
$q_i \ge v^c_{(1,M)}(\theta_{-i}) \Rightarrow q_i > v^c_{(1,M)}(\theta_{-i})$, which can be achieved by a slight upward
perturbation of any value $q_i = v^c_{(1,M)}(\theta_{-i})$. Finally, the online mechanism can
allocate at most one of these bids since any bid allocated must be allocated in
period 1 and can achieve value at most $q + \delta$ while the efficient offline allocation
has value $V^*(\theta) \ge Mq$. Thus, no constant approximation is possible because $M$
can be selected to be arbitrarily large.


16.3.3 Example: An Adaptive, Limited-Supply Auction 


For our second detailed example, we consider an environment with a single, indivisible
item to be allocated to one of $N$ agents. Each agent's type is still denoted
$\theta_i = (a_i, d_i, w_i) \in T \times T \times \mathbb{R}_{>0}$, with $w_i$ denoting the agent's value for the item. This
fits into the known interesting-set model. We assume no early-arrival misreports but 
will allow arbitrary misreports of departure. Our goal is to define an auction with good 
revenue and efficiency properties. We will work with a weaker adversarial model than 
in the setting with expiring items. 

We relate this dynamic auction problem to the classical secretary problem, a well- 
studied problem in optimal stopping theory: 


The Secretary Problem. An interviewer meets in turn with each of a pool of $N$ job applicants.
The total number of applicants is known. Each applicant has a quality
and the interviewer learns, upon meeting, the relative rank of each applicant among
those already interviewed and must make an irrevocable decision about whether or not
to hire the applicant. The goal is to hire the best applicant. By the “random-ordering 
hypothesis,” an adversary can choose an arbitrary set of N qualities but cannot control 
the assignment of quality to applicant, rather this is sampled uniformly at random 
and without replacement from the set. The problem is to design a stopping rule that 
maximizes the probability of hiring the highest rank applicant, in the worst case for 
all possible adversarially selected inputs. Say that a candidate is the most qualified 
of all applicants seen so far. The optimal policy (i.e., the policy that maximizes the 
probability of selecting the best applicant, in the worst case) is to interview the first 
$t - 1$ applicants and then hire the next candidate (if any), where $t$ is defined by

$$\sum_{j=t+1}^{N} \frac{1}{j-1} \le 1 < \sum_{j=t}^{N} \frac{1}{j-1}. \tag{16.9}$$

For instance, with $N = 10{,}000$ the optimal $t$ is 3,680, i.e., sample 3,679 applicants and
then accept the next candidate. As $N \to \infty$, the probability of hiring the best applicant
approaches $1/e$, as does the ratio $t/N$, and the optimal policy in this big $N$ limit
is to sample the first $\lfloor N/e \rfloor$ applicants and then immediately accept any subsequent
candidate. 
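The threshold in (16.9) is easy to compute directly. The function below is a minimal sketch (the name `secretary_threshold` is illustrative): it accumulates the sum from its last term backward and returns the first $t$ at which the sum exceeds 1, assuming the standard form of the condition with sums of $1/(j-1)$.

```python
def secretary_threshold(n: int) -> int:
    """Return t with sum_{j=t+1}^{n} 1/(j-1) <= 1 < sum_{j=t}^{n} 1/(j-1)."""
    tail = 0.0  # running value of sum_{j=t+1}^{n} 1/(j-1); empty when t == n
    for t in range(n, 1, -1):
        tail += 1.0 / (t - 1)  # now equals sum_{j=t}^{n} 1/(j-1)
        if tail > 1.0:
            return t
    return 1  # for n <= 2 the sum never exceeds 1, so no sampling phase is used


t = secretary_threshold(10_000)
print(t, t / 10_000)  # 3680 0.368
```

For $N = 10{,}000$ this gives the value quoted in the text, with $t/N$ close to $1/e \approx 0.3679$.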


We can reinterpret the secretary problem in the auction context. Bidders, unlike the 
applicants in the classic model, are strategic and can misrepresent their value and time 
their entry into the market. Bidders also have both an entry and an exit time. We modify 
the adversarial model in the secretary problem while retaining the random-ordering 
hypothesis: an adversary picks a set of values and a set of arrival-departure intervals and
agent types are then defined by sampling uniformly at random and without replacement
from each set.⁵

In addition to efficiency, we will also consider revenue as an optimality criterion. 
The auction's revenue for type profile $\theta$ is defined as $\mathrm{Rev}(p(\theta)) = \sum_i p_i(\theta)$, where
notation $p_i(\theta)$ denotes the (expected) payment by agent $i$ given type profile $\theta$. Notation
$\omega \in \Omega$ is suppressed because there are no external stochastic events in the problem.
For an offline benchmark we consider the revenue from an offline Vickrey auction and
define $R^*(\theta)$ as the second-highest value in type profile $\theta$. An online mechanism is
c-competitive for revenue if 


$$\min_{z \in Z} \mathbb{E}\left\{ \frac{\mathrm{Rev}(p(\theta_z))}{R^*(\theta_z)} \right\} \ge \frac{1}{c} \tag{16.10}$$


where $Z$ is the set of inputs available to an adversary, in this case choosing the
two sets described above, and the expectation here is taken with respect to the random
choice of the sampling process that matches values with arrival-departure intervals.

As we have seen, the optimal policy for the secretary problem has a learning
phase followed by an accepting phase. For a straw-man online auction interpretation,
consider: observe the first $\lfloor N/e \rfloor$ reports and then price at the maximal value
received so far, and sell to the first agent to subsequently report a value greater than


5 By an averaging argument, our results for randomly ordered inputs imply the same (upper-bound) competitive- 
ratio analysis when the bids consist of i.i.d. samples from an unknown distribution. 
this price. Break ties at random. The following example shows that this fails to be
truthful. 


Example 16.20 Consider six agents, with types $\theta_i = (a_i, d_i, w_i)$ and
$\theta_1 = (1, 7, 6)$, $\theta_2 = (3, 7, 2)$, $\theta_3 = (4, 8, 4)$, $\theta_4 = (6, 7, 8)$, and agents 5 and 6 arriving
in later periods. The transition to the accepting phase occurs after $\lfloor 6/e \rfloor = 2$ bids.
Agent 4 wins in period 6 and makes payment 6. If agent 1 reports $\theta_1' = (5, 7, 6)$,
then it wins in period 5, for payment 4.


The auction is truthful when all agents are impatient (a; = d;) but fails to be truthful 
in the general setting with patient agents because the allocation policy is not monotonic 
with respect to arrival time. Consider instead the following simple variation. 


Auction 3. A bid from an agent is a claim about its type, $\hat{\theta}_i = (\hat{a}_i, \hat{d}_i, \hat{w}_i)$, necessarily
made in period $t = \hat{a}_i$.


(i) (Learning): In period $t$ in which the $\lfloor N/e \rfloor$th bid is received let $p \ge q$ be the
top two bid values received so far. 
(ii) (Transition): If an agent bidding p is still present in period t then sell to that 
agent (breaking ties at random) at price q. 
(iii) (Accepting): Else, sell to the next agent to bid a price at least p (breaking ties at 
random), collecting payment p. 


Theorem 16.21 Auction 3 is strongly truthful in the single-unit, limited supply 
environment with no early-arrival misreports. 


PROOF Assume that the method used to break ties is independent of the reported 
departure time of an agent. Fix $\theta_{-i}$. Monotonicity is established by case analysis
on type $\theta_i$: (a) If $d_i$ is to the left of the transition, the agent is not allocated
and monotonicity trivially holds. (b) If $[a_i, d_i]$ spans the transition, agent $i$ does
not trigger the transition, and it wins with $w_i > q$ then there is no tie-breaking
and the agent continues to win for an earlier arrival or later departure (because
this changes nothing about the price it faces when the transition occurs), and
continues to win with a higher value. (c) If arrival, $a_i$, is after the transition and
agent $i$ wins with $w_i \ge p$ (and perhaps winning a random selection over another
agent $j$ arriving in the same period also with $w_j \ge p$) then it continues to win
with an earlier arrival (even one that occurs before the transition because its value
will define $p$), with a later departure (because tie-breaking is invariant to reported
departure) and with a higher value. (d) If the agent triggers the transition and
wins with $w_i > q$ then its value $w_i = p$, there was no tie to break, and the agent
continues to win for an earlier arrival (although at some point the transition will be
triggered by the next earliest agent to arrive), for a higher value, and is unaffected
by a later departure. The payment is the critical value, namely $q$ in case (b) and (d)
and p in case (c). Moreover, the policy is monotonic-late: in case (b) the critical 
value is infinite for all departures before the transition but constant with respect 
to departure otherwise and the critical departure period is that of the transition; in 
cases (c) and (d) the critical value payment is independent of departure time and 
the critical departure period is equal to the arrival period. 

Example 16.22 Return to the earlier example with six agents and types
$\theta_1 = (1, 7, 6)$, $\theta_2 = (3, 7, 2)$, $\theta_3 = (4, 8, 4)$, $\theta_4 = (6, 7, 8)$, with agents 5 and 6 arriving
in later periods. The transition to the accepting phase occurs upon the arrival
of agent 2. Then $p = 6$, $q = 2$, and agent 1 wins for 2. Consider instead that
$\theta_1' = (1, 2, 6)$. The transition still occurs upon the arrival of agent 2 but now the
item is sold in period 6 to agent 4 for a payment of 6. An agent with true type $\theta_1'$ does
not want to report $\theta_1$ because of the monotonic-late property: although it would
win, it would not be allocated until period 3, and this is after its true departure.
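Both outcomes in this example can be reproduced with a small simulation of Auction 3. The sketch below makes several simplifying assumptions that are not in the text: arrival periods and values are distinct (so no tie-breaking is needed), the learning phase contains at least two bids, and the types chosen for agents 5 and 6 are hypothetical placeholders, since the example only says that they arrive in later periods.

```python
import math
from typing import List, Optional, Tuple

Bid = Tuple[str, int, int, float]  # (name, arrival, departure, value)


def auction3(bids: List[Bid]) -> Optional[Tuple[str, int, float]]:
    """Return (winner, period, payment), or None if the item goes unsold."""
    order = sorted(bids, key=lambda b: b[1])       # bids in order of arrival
    k = math.floor(len(bids) / math.e)             # number of bids in the learning phase
    learning, later = order[:k], order[k:]
    transition = learning[-1][1]                   # period in which the k-th bid arrives
    ranked = sorted(learning, key=lambda b: b[3], reverse=True)
    (top_name, a, d, p), (_, _, _, q) = ranked[0], ranked[1]
    if a <= transition <= d:                       # top bidder still present: sell at q
        return (top_name, transition, q)
    for name, arrival, _, w in later:              # accepting phase: first bid of at least p
        if w >= p:
            return (name, arrival, p)
    return None


# Agents 5 and 6 are hypothetical; the example only fixes agents 1 to 4.
tail = [("agent2", 3, 7, 2), ("agent3", 4, 8, 4), ("agent4", 6, 7, 8),
        ("agent5", 8, 9, 1), ("agent6", 9, 9, 3)]
print(auction3([("agent1", 1, 7, 6)] + tail))  # ('agent1', 3, 2)
print(auction3([("agent1", 1, 2, 6)] + tail))  # ('agent4', 6, 6)
```

In the first run agent 1 is allocated in period 3, the transition period, which is the timing that makes a report of departure 7 unattractive to an agent whose true departure is period 2.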


Theorem 16.23 Auction 3 is $e + o(1)$-competitive for efficiency and $e^2 + o(1)$-competitive
for revenue in the single-unit, limited supply environment in the limit
as $N \to \infty$.


PROOF Let $t = \lfloor N/e \rfloor$. For efficiency, our competitive ratio is at least as great
as the probability of selling to the highest value agent. Conditioned on selling at
the transition, the probability that we sell to the highest value agent is at least
$\lfloor N/e \rfloor / N = 1/e - o(1)$. Conditioned on selling after the transition, the probability of
this event is $1/e - o(1)$ according to the analysis of the classical secretary problem.
For revenue, our competitive ratio is at least as great as the probability of selling
to the highest value agent at a price equal to the second-highest bid. Conditioned
on selling at the transition, the probability of this event is $(1/e)^2 - o(1)$ (i.e., the
probability that both the highest and second-highest value agents arrive before
period $t$). Conditioned on selling after the transition, the probability of this event
is $(1/e)(1 - 1/e) - o(1)$, i.e., the probability that the second-highest value agent
arrives before $t$ and the highest value agent arrives after $t$. The unconditional
probability of selling to the highest value agent at the second-highest price is a
weighted average of the two conditional probabilities computed above, hence it
is at least $(1/e)^2 - o(1)$.


The random-ordering hypothesis has a critical role in this analysis: there is no 
constant competitive mechanism in this environment for the adversarial model adopted 
in our analysis of the expiring items environment. 

For the secretary problem it is well known that no stopping rule can achieve asymp- 
totic success probability better than 1/e. The same lower bound can be established in 
our setting, even though the mechanism has richer feedback (i.e., it sees numbers not 
ranks) and even though an allocation to some bidder other than the highest-rank bidder 
will contribute to expected efficiency. The proof of this result is beyond the scope of 
this chapter.⁶


16.3.4 Remarks 


We end this section with some general remarks that mostly seek to place the study 
of online mechanisms in single-valued preference domains in the broader context of 
computational mechanism design. 


6 One shows that for any stopping rule there is some distribution that is hard in the sense that the second-highest
value in the sequence is much less than the highest value with high probability. Given this, the expected efficiency 
ratio of the allocation is determined, to first order, by the probability of awarding the item to the highest bidder. 

Ex-post IC. A mechanism is ex-post IC if truth revelation is a best-response contingent
on other agents being truthful, and whatever the types of other agents (and
thus for all possible futures in the context of online MD). In offline mechanisms the 
solution concepts of ex-post incentive compatible (EPIC) and DSIC are equivalent 
with private value types. This equivalence continues to hold for closed online mecha- 
nisms that provide no feedback to an agent before it submits a bid. However, an online 
mechanism that provides feedback, for instance prices, or in an extreme case reports 
of current bids, loses this property. The report of an agent can now be conditioned 
on the reports of earlier agents, and monotonicity provides EPIC but not necessarily 
DSIC. Consider again Auction 2 in the expiring items environment, with true types 
$\theta_1 = (1, 2, 100)$, $\theta_2 = (1, 2, 80)$, and $\theta_3 = (2, 2, 60)$. If the bids are public then a possible
(crazy) strategy of agent 3 is to condition its bid as follows: “bid (2, 2, 1000) if a
bid of (1, 2, 100) is received or bid (2, 2, 60) otherwise.” Agent 1 will now pay 80 if it
bids truthfully, but would pay 60 with a bid of (1, 2, 90). Nevertheless, truthful bidding
is a best response when other agents bid truthfully. 


Simple price-based online auctions. One straightforward method to construct 
truthful online auctions for known-set, single-valued environments is to define 
an agent-independent price schedule $q_i^t(L, \theta_{-i}, \omega) \in \mathbb{R}$ to agent $i$ for interesting
decision set $L \in \mathcal{L}_i$, given stochastic events $\omega \in \Omega$, where $q_i^t(L, \theta_{-i}, \omega)$ defines
the price for a decision in set $L$ in period $t$. Given this, define payment
$p_{(a_i,d_i,L_i)}(\theta_{-i}, \omega) = \min_{t \in [a_i,d_i]} q_i^t(L_i, \theta_{-i}, \omega)$ and let $t^*_{(a_i,d_i,L_i)}(\theta_{-i}, \omega)$ denote the first
period $t \in [a_i, d_i]$ in which $q_i^t(L_i, \theta_{-i}, \omega) = p_{(a_i,d_i,L_i)}(\theta_{-i}, \omega)$. Then, decision policy $\pi$
that allocates to agent $i$ with type $\theta_i = (a_i, d_i, (r_i, L_i))$ if and only if $r_i \ge q_i^t(L_i, \theta_{-i}, \omega)$
in some $t \in [a_i, d_i]$, with the allocation period $t \ge t^*_{(a_i,d_i,L_i)}(\theta_{-i}, \omega)$, is monotonic-late
and the associated critical-value payment is just $p_{(a_i,d_i,L_i)}(\theta_{-i}, \omega)$. Working with price
schedules is quite natural in many domains, although not completely general, as shown 
in the following example: 


Example 16.24 Consider the canonical expiring items environment. Fix
$\theta_{-i}$, and consider a monotonic-late policy $\pi$ with critical-value $v^c_{(1,2)}(\theta_{-i}) = 20$,
$v^c_{(1,1)}(\theta_{-i}) = v^c_{(2,2)}(\theta_{-i}) = 30$ (dropping dependence on $\omega$ because there are
no stochastic events to consider). This policy allocates to type $\theta_i = (1, 2, 25)$ in
period 2 but not type $\theta_i' = (1, 1, 28)$ or $\theta_i'' = (2, 2, 28)$. No simple price schedule
corresponds to this policy, because it would require $q_i^1(\theta_{-i}) > 28$, $q_i^2(\theta_{-i}) > 28$
but $\min(q_i^1(\theta_{-i}), q_i^2(\theta_{-i})) < 25$.
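The mismatch can be made concrete by computing what a price schedule implies for each arrival-departure interval. This is an illustrative sketch: the helper `schedule_outcome` and the dictionary layout are assumptions made for this example, with prices indexed by period for a single agent and a fixed $\theta_{-i}$.

```python
from typing import Dict, Tuple


def schedule_outcome(prices: Dict[int, float], a: int, d: int) -> Tuple[float, int]:
    """Payment min over t in [a, d] of the period price, and the first period attaining it."""
    window = range(a, d + 1)
    payment = min(prices[t] for t in window)
    first = next(t for t in window if prices[t] == payment)
    return payment, first


# Critical values from Example 16.24, keyed by (arrival, departure).
target = {(1, 1): 30, (2, 2): 30, (1, 2): 20}

# The single-period intervals pin the schedule down: both period prices must be 30.
prices = {1: target[(1, 1)], 2: target[(2, 2)]}
implied = {interval: schedule_outcome(prices, *interval)[0] for interval in target}
print(implied)            # {(1, 1): 30, (2, 2): 30, (1, 2): 30}
print(implied == target)  # False
```

Once the two single-period critical values fix the schedule, the implied critical value for interval (1, 2) is 30 rather than 20, so no schedule reproduces the policy.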


The role of limited misreports. Consider again the above example. The price on an 
allocation to agent $i$ in period 2 depends on its report: if the agent's type is $\theta_i = (2, 2, w_i)$
then the price is 30 but if the agent's type is $\theta_i = (1, 2, w_i)$ then the price is 20. This
is at odds with the principle of “agent-independent prices” that drives the standard 
analysis of truthful mechanisms. The example also fails weak-monotonicity, which is 
generally necessary for truthfulness.⁷


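7 A social choice function $f : \Theta \to O$ satisfies weak monotonicity if and only if for any $\theta_i, \theta_i' \in \Theta_i$, agent $i$, and
$\theta_{-i} \in \Theta_{-i}$, then $f(\theta_i, \theta_{-i}) = a$ and $f(\theta_i', \theta_{-i}) = b$ implies that $v_i(b, \theta_i') - v_i(b, \theta_i) \ge v_i(a, \theta_i') - v_i(a, \theta_i)$. In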

What is going on? In both cases, the reason for this departure from the standard theory
for truthful mechanism design is the existence of limited misreports. The auction would
not be truthful with early-arrival misreports because an agent with type (2, 2, 28) could
usefully deviate and report (1, 2, 28). For limited misreports $C(\theta_i) \subseteq \Theta_i$ that satisfy
transitivity (which holds for the no-early arrival and no-late departure assumptions that
are motivated in online MD), so that $\theta_i' \in C(\theta_i)$ and $\theta_i'' \in C(\theta_i')$ implies $\theta_i'' \in C(\theta_i)$, the
payment $p_i(k, \theta_i, \theta_{-i}, \omega)$ collected from agent $i$ conditioned on outcome $k \in O$ must
satisfy $p_i(k, \theta_i, \theta_{-i}, \omega) = \min\{p_i(k, \theta_i', \theta_{-i}, \omega) : \theta_i' \in C(\theta_i),\ \pi(\theta_i', \theta_{-i}, \omega) = k\}$, or $\infty$
if no such $\theta_i'$ exists, for all $i$, all $k \in O$ and all $\omega \in \Omega$. Limited dependence on the
reported type is possible as long as the price is independent across available misreports.
For unlimited misreports we recover the standard requirement that prices are agent- 
independent. 

So, the temporal aspect of online MD is both a blessing and a curse: on one hand we 
can justify limited misreports and gain more flexibility in pricing and in the timing of 
allocations, on the other hand decisions must be made in ignorance about future types. 


Relaxing the known interesting-set assumption. We assumed that the interesting set 
$L_i \in \mathcal{L}_i$ was known by the mechanism. Domains in which the interesting set is private
information to an agent can be handled by making the following modifications to the 
framework: 


(i) Require that agent $i$'s domain of interesting sets $\mathcal{L}_i = \{L_1, \ldots, L_m\}$ defines disjoint
sets so that $L_1 \cap L_2 = \emptyset$ for all distinct $L_1, L_2 \in \mathcal{L}_i$.
(ii) Require that a decision policy $\pi$ is minimal so that it never makes decision $k^t \in L$ for
some $L \succ_L L_i$ in some period $t \in [a_i, d_i]$, given reported type $\theta_i = (a_i, d_i, (r_i, L_i))$.
(iii) Extend the partial-order so that

$$\theta_1 \preceq_\theta \theta_2 \equiv (a_1 \ge a_2) \wedge (d_1 \le d_2) \wedge (r_1 \le r_2) \wedge (L_1 \succeq_L L_2), \tag{16.11}$$

and adopt this partial order in defining monotonicity.


Given these modifications, the general methods developed above for the analysis of 
online mechanisms continue to hold. For instance, a monotonic, minimal, and deter- 
ministic policy continues to be truthful when combined with critical-value payments, 
and monotonicity remains necessary for truthfulness amongst minimal, deterministic 
policies. This is left as an exercise. 

The requirement that interesting sets are disjoint can significantly curtail the general- 
ity of preference domains that can be modeled. It is especially hard to model substitutes 
preferences, for instance indifference across a set of items. Suppose that the items are 
fruit, with G = {apple, banana, pear, lime, lemon}. With known interesting sets, 
we can model an agent with a type that defines a value for receiving an item from any 
subset of the domain G. With unknown interesting sets, we must now assume that there 
is some partition, for instance into {{apple, pear}, {banana}, {lime, lemon}} so that 
the agent has either the same value for an apple or a pear and no value for anything
else, or value for a banana and no value for anything else, or value for a lime and a
lemon but no value for anything else.


the example, when agent i changes its type from (1, 2, 25) to (2, 2, 28) it increases its relative value for an
allocation in period 2 over no allocation, but the decision policy switches away from allocating to the agent in
period 2.


Stochastic policies. Stochastic decision policies can be important, both algorithmically 
(many computational methods for online decision use a probabilistic model to sample 
possible state trajectories) and also to allow for tie breaking while retaining anonymity. 

So far we have handled this by requiring strong-truthfulness. More generally, a
stochastic mechanism is DSIC when truthful reporting maximizes expected utility for
an agent (with the expectation defined with respect to randomization in the policy),
for all reports of other agents and all external stochastic events $\omega \in \Omega$. To handle
this, we now use $\pi_i(\theta_i, \theta_{-i}, \omega) \in [0, 1]$ to denote the probability that agent $i$ receives an
interesting decision (“is allocated”), given type $\theta_i$, types $\theta_{-i}$, and (external) stochastic
events $\omega$. The appropriate generalization of monotonicity to stochastic policies requires,
for every $\theta_i = (a_i, d_i, (r_i, L_i))$, all $\theta_{-i}$, and all $\omega \in \Omega$, that


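$$\pi_i((a_i, d_i, (r_i, L_i)), \theta_{-i}, \omega) \geq \pi_i((a_i, d_i, (r'_i, L_i)), \theta_{-i}, \omega), \quad \forall r'_i \leq r_i \tag{16.12}$$


and


$$\int_{x=0}^{r_i} \pi_i((a_i, d_i, (x, L_i)), \theta_{-i}, \omega)\,dx \geq \int_{x=0}^{r_i} \pi_i((a'_i, d'_i, (x, L_i)), \theta_{-i}, \omega)\,dx, \tag{16.13}$$


for all $a'_i \geq a_i$, $d'_i \leq d_i$. The critical value payment becomes


$$v^c_{(a_i, d_i, (r_i, L_i))}(\theta_{-i}, \omega) = \pi_i(\theta_i, \theta_{-i}, \omega)\, r_i - \int_{x=0}^{r_i} \pi_i((a_i, d_i, (x, L_i)), \theta_{-i}, \omega)\,dx. \tag{16.14}$$
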
These definitions of monotonicity and critical-value payment reduce to the earlier cases 
when the policy is deterministic. 


Theorem 16.25 A stochastic decision policy $\pi$ can be implemented in a truthful,
IR mechanism that does not pay unallocated agents in a domain with (known
interesting set) single-valued preferences and no early-arrival or late-departure
misreports if and only if the policy is monotonic according to (16.12) and (16.13).


The payment collected from allocated agents is the critical-value payment. The 
following example illustrates a stochastic policy that satisfies this monotonicity re- 
quirement. 


Example 16.26 Consider a domain with no early arrival and no late departure
misreports, two time periods $T = \{1, 2\}$, fix $\theta_{-i}$, and consider agent $i$ with a
single-item valuation and possible types $\Theta_i = \{(1, 1, w_i), (1, 2, w_i), (2, 2, w_i)\}$.
For impatient type $(1, 1, w_i)$, consider policy


$$\pi_i((1, 1, w_i), \theta_{-i}) = \begin{cases}
0, & \text{if } w_i \leq 8 \\
\frac{w_i - 8}{2}, & \text{if } 8 < w_i \leq 10 \\
1, & \text{otherwise.}
\end{cases} \tag{16.15}$$

Solving for the critical value payment (16.14), we find


$$v^c_{(1, 1, w_i)}(\theta_{-i}) = \begin{cases}
0, & \text{if } w_i \leq 8 \\
\frac{w_i^2}{4} - 16, & \text{if } 8 < w_i \leq 10 \\
9, & \text{otherwise.}
\end{cases} \tag{16.16}$$


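The policy and critical value payment are defined identically for type $(2, 2, w_i)$.
For patient type $(1, 2, w_i)$, consider policy


$$\pi_i((1, 2, w_i), \theta_{-i}) = \begin{cases}
\frac{w_i}{20}, & \text{if } 0 \leq w_i \leq 10 \\
\frac{w_i}{10} - \frac{1}{2}, & \text{if } 10 < w_i \leq 15 \\
1, & \text{otherwise}
\end{cases} \tag{16.17}$$


and the critical value payment, from (16.14), is


$$v^c_{(1, 2, w_i)}(\theta_{-i}) = \begin{cases}
\frac{w_i^2}{40}, & \text{if } 0 \leq w_i \leq 10 \\
\frac{w_i^2}{20} - \frac{5}{2}, & \text{if } 10 < w_i \leq 15 \\
8.75, & \text{otherwise.}
\end{cases} \tag{16.18}$$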


Notice that $\pi_i((1, 1, 10), \theta_{-i}) = 1$ and $\pi_i((1, 2, 10), \theta_{-i}) = 0.5$, contradicting more
simplistic notions of monotonicity, but that truthfulness is retained because
$v^c_{(1, 1, 10)}(\theta_{-i}) = 9$ while $v^c_{(1, 2, 10)}(\theta_{-i}) = 2.5$. Although type $(1, 2, 10)$ can misreport
to $(1, 1, 10)$ and be allocated with certainty, it prefers to report $(1, 2, 10)$
because its expected utility is $(0.5)(10 - 2.5) + (0.5)(0) > (1.0)(10 - 9)$. We
leave as an exercise to check that these policies satisfy monotonicity, with
$\int_{x=0}^{w_i} \pi_i((1, 2, x), \theta_{-i})\,dx \geq \int_{x=0}^{w_i} \pi_i((1, 1, x), \theta_{-i})\,dx$ for all $w_i$.
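
The quantities in this example can be checked numerically. The sketch below is only illustrative: the impatient policy follows (16.15), while the piecewise-linear form used for the patient type (rising from 0 to 0.5 on [0, 10] and from 0.5 to 1 on [10, 15]) is an assumption chosen to agree with the values quoted in the example (allocation probability 0.5 and payment 2.5 at value 10, payment 8.75 above 15). The function names are ours.

```python
from typing import Callable


def pi_impatient(w: float) -> float:
    """Allocation probability for types (1, 1, w) and (2, 2, w), as in (16.15)."""
    if w <= 8:
        return 0.0
    if w <= 10:
        return (w - 8) / 2
    return 1.0


def pi_patient(w: float) -> float:
    """ASSUMED allocation probability for type (1, 2, w); see the note above."""
    if w <= 10:
        return w / 20
    if w <= 15:
        return w / 10 - 0.5
    return 1.0


def integral(pi: Callable[[float], float], w: float, steps: int = 4000) -> float:
    """Midpoint-rule approximation of the integral of pi over [0, w]."""
    if w <= 0:
        return 0.0
    h = w / steps
    return sum(pi((j + 0.5) * h) for j in range(steps)) * h


def critical_value_payment(pi: Callable[[float], float], w: float) -> float:
    """Payment (16.14): pi(w) * w minus the area under pi on [0, w]."""
    return pi(w) * w - integral(pi, w)


def dominates(w: float) -> bool:
    """Condition (16.13) for this example: the patient type's integral is
    at least the impatient type's integral at value w."""
    return integral(pi_patient, w) >= integral(pi_impatient, w) - 1e-9
```

Under these assumptions, `critical_value_payment(pi_impatient, 10)` is approximately 9 and `critical_value_payment(pi_patient, 10)` is approximately 2.5, and `dominates(w)` holds on a grid of values, even though the pointwise allocation probabilities are not ordered at value 10.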




We make a final remark about stochastic policies. In an environment with a probabilistic
model that is common knowledge, and that defines both a probability distribution
for agent types and for stochastic events $\omega \in \Omega$, we can settle for a weaker
monotonicity requirement in which (16.12) and (16.13) are satisfied in expectation,
given the model. However, this provides BNIC but not DSIC since monotonicity may 
not hold out of equilibrium when other agents are not truthful, since the probabilistic 
model of agent types upon which monotonicity is predicated would then be incorrect. 


16.4 Bayesian Implementation in Online Domains 


In this section we focus on Bayesian implementation of expected value-maximizing
policies in environments in which the designer and every agent have a correct, probabilistic
model for types and uncertain events, and this is common knowledge. We
consider the goal of value maximization and present a dynamic variation of the offline
Vickrey–Clarke–Groves (VCG) mechanism. This will involve computing expected
value maximizing sequential decision policies and raise a number of computational
challenges. We will see that the dynamic VCG mechanism is BNIC rather than
DSIC, with incentive-compatibility contingent on future on-equilibrium play by all
participants.




16.4.1 A General Model 


A Markov decision process (MDP) provides a useful formalism for defining online
mechanisms in model-based environments with general agent preferences. An
MDP model $(H, K, P, R)$ is defined for a set of states $H$, feasible decisions $K(h)$ in
each state, a probabilistic transition function $P(h^{t+1} \mid h^t, k^t)$ on the next state given
current state and decision (with $\sum_{h' \in H^{t+1}} P(h' \mid h^t, k^t) = 1$), and a reward function
$R(h^t, k^t) \in \mathbb{R}$ for decision $k^t$ in state $h^t$. The Markov property requires that feasible
decisions, transitions, and rewards depend on previous states and actions only
through the current state. It is achieved here, for example, by defining $h^t \in H^t =
(\theta^1, \ldots, \theta^t; \omega^1, \ldots, \omega^t; k^1, \ldots, k^{t-1})$ so that the state captures the complete history of
types, stochastic events, and decisions. In practice, a short summarization of state $h^t$ is
often sufficient to retain the Markov property.

For a social planner interested in maximizing total value, define reward
$R(h^t, k^t) = \sum_{i \in I(h^t)} R_i(h^t, k^t)$, with $I(h^t)$ used to denote the set of agents present in
state $h^t$, and with agent $i$’s reward $R_i(h^t, k^t)$ defined so that $v_i(\theta_i, k) = \sum_{t=a_i}^{d_i} R_i(h^t, k^t)$
for all sequences of decisions $k$. For finite time horizons, the expected value of policy
$\pi$ in state $h^t$ is $V^\pi(h^t) = E_\pi\{\sum_{\tau=t}^{T} R(h^\tau, \pi(h^\tau))\}$, where the expectation is taken
with respect to the transition model and given the state-dependent decisions implied by
policy $\pi$. For infinite time horizons, a standard approach is to define a discount factor
$\gamma \in (0, 1)$ so that the expected discounted value of policy $\pi$ in state $h^t$ is $V^\pi(h^t) =
E_\pi\{\sum_{\tau=t}^{\infty} \gamma^{\tau - t} R(h^\tau, \pi(h^\tau))\}$. This makes sense in a multiagent environment when
every agent has the same discount factor $\gamma$.

Given MDP value $V^\pi(h^t)$, the optimal policy $\pi^*$ maximizes this value
in every state $h^t$. For instance, in the finite time-horizon (no discounting) setting, the
optimal MDP-value function, $V^*$, is defined to satisfy the recurrence


$$V^*(h) = \max_{k \in K(h)} \Big[ R(h, k) + \sum_{h' \in H^{t+1}} P(h' \mid h, k)\, V^*(h') \Big], \tag{16.19}$$


for all time $t$ and all $h \in H^t$. Given this, the optimal decision policy solves


$$\pi^*(h \in H^t) \in \arg\max_{k \in K(h)} \Big[ R(h, k) + \sum_{h' \in H^{t+1}} P(h' \mid h, k)\, V^*(h') \Big]. \tag{16.20}$$



Of course, the type information within the state is private to agents and we will need 
to provide incentive compatibility so that the policy has the correct view of the current 
state. 
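
The recurrence for the optimal value and the corresponding decision rule can be solved by backward induction over periods. The following is a minimal sketch under stated assumptions: states are hashable and distinct across periods, and the function `backward_induction`, its parameter names, and the small two-period, single-item instance (an impatient agent with value 5 in period 1, and a period-2 arrival with value 12 or 2, each with probability 0.5) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

State = Hashable
Decision = Hashable


def backward_induction(
    states_by_period: List[Iterable[State]],
    feasible: Callable[[State], Iterable[Decision]],
    transition: Callable[[State, Decision], Dict[State, float]],
    reward: Callable[[State, Decision], float],
) -> Tuple[Dict[State, float], Dict[State, Decision]]:
    """Finite-horizon, undiscounted value recurrence and greedy policy.

    states_by_period[t] lists the states of period t. States in the final
    period have no successors, so transition() should return {} for them.
    """
    value: Dict[State, float] = {}
    policy: Dict[State, Decision] = {}
    for period_states in reversed(states_by_period):
        for h in period_states:
            best_value, best_k = float("-inf"), None
            for k in feasible(h):
                # Immediate reward plus expected optimal value of the next state.
                q = reward(h, k) + sum(
                    p * value[h_next] for h_next, p in transition(h, k).items()
                )
                if q > best_value:
                    best_value, best_k = q, k
            value[h], policy[h] = best_value, best_k
    return value, policy


# Hypothetical instance: sell now for 5, or wait for an uncertain arrival.
STATES = [["t1"], ["t2_sold", "t2_high", "t2_low"]]
REWARD = {("t1", "sell"): 5.0, ("t2_high", "sell"): 12.0, ("t2_low", "sell"): 2.0}
NEXT = {
    ("t1", "sell"): {"t2_sold": 1.0},
    ("t1", "wait"): {"t2_high": 0.5, "t2_low": 0.5},
}


def feasible(h: State) -> List[str]:
    """The item can be sold only while it is still unsold."""
    return ["wait"] if h == "t2_sold" else ["sell", "wait"]


value, policy = backward_induction(
    STATES,
    feasible,
    lambda h, k: NEXT.get((h, k), {}),
    lambda h, k: REWARD.get((h, k), 0.0),
)
print(value["t1"], policy["t1"])  # prints: 7.0 wait
```

In this instance waiting has expected value 0.5(12) + 0.5(2) = 7, which exceeds the value 5 of selling immediately, so the optimal policy delays the allocation.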


Example 16.27 The definition of state, feasible decision, and agent type is as
in Example 16.3. The transition function $P(h^{t+1} \mid h^t, k^t)$ is constructed to reflect
a probabilistic model of new agent arrivals, and also the allocation decision. The
MDP reward function, $R(h^t, k^t)$, can be defined with $R(h^t, k^t) = w_i$ if decision
$k^t$ allocates the item to agent $i$, for some agent $i$ present in the state, and zero
otherwise.
16.4.2 A Dynamic Vickrey–Clarke–Groves Mechanism


For concreteness, consider an environment with a finite time horizon and no discounting,
and with the optimal MDP value $V^*(h)$ defined as the total expected reward from
state $h$ until the time horizon. We make some remarks about how to handle an infinite
time horizon in Section 16.4.3. Consider the following dynamic VCG mechanism.⁸
We assume that the decisions and reports in previous periods $t' < t$ are all public in
period $t$, although similar analysis holds without this.


Auction 4. The dynamic VCG mechanism for the finite time horizon and no-discounting
online MD environment works as follows:


(i) Each agent, $i$, reports a type $\hat{\theta}_i$ in some period $\hat{a}_i \geq a_i$.

(ii) Decision policy: Implement optimal policy $\pi^*$, which maximizes the total expected
value, assuming the current state as defined by agent reports is the true
state.

(iii) Payment policy: In an agent’s reported departure period, $t = \hat{d}_i$, collect payment


$$x_i(h^t) = v_i(\hat{\theta}_i, \pi^*(\theta^{\leq t}, \omega^{\leq t})) - [V^*(h^{\hat{a}_i}) - V^*(h^{\hat{a}_i}_{-i})], \tag{16.21}$$


where $\pi^*(\theta^{\leq t}, \omega^{\leq t})$ denotes the sequence of decisions made up to and including
period $t$ based on types $\theta^{\leq t}$ and stochastic events $\omega^{\leq t}$, $V^*(h^{\hat{a}_i})$ is the optimal MDP
value in state $h^{\hat{a}_i}$, and $h^{\hat{a}_i}_{-i}$ defines the (counterfactual) MDP state constructed to
be equal to $h^{\hat{a}_i}$ but removing agent $i$’s type from the state. The payment is zero
otherwise.


Agent $i$’s payment is its ex-post value discounted by term $(V^*(h^{\hat{a}_i}) - V^*(h^{\hat{a}_i}_{-i}))$,
which is the expected marginal value it contributes to the system as estimated upon its
arrival and based on its report. With this, the expected utility to agent $i$ when reporting
truthfully is equal to the expected marginal value that it contributes to the multiagent
system through its presence.
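
To make the payment rule concrete, the following minimal sketch computes the dynamic VCG payment from three numbers that are assumed to be supplied by an MDP solver. The function name, its parameters, and the numerical instance in the note after the code are hypothetical.

```python
def dynamic_vcg_payment(
    reported_value: float, v_star_with: float, v_star_without: float
) -> float:
    """Payment collected in the agent's reported departure period.

    reported_value: the agent's value for the realized decisions, evaluated
                    at its reported type.
    v_star_with:    optimal MDP value in the state at the reported arrival.
    v_star_without: optimal MDP value in the same state with the agent's
                    type removed (the counterfactual state).
    """
    expected_marginal_value = v_star_with - v_star_without
    return reported_value - expected_marginal_value


def expected_utility(outcomes, v_star_with: float, v_star_without: float) -> float:
    """Expected utility of a truthful agent.

    outcomes is a list of (probability, realized_value) pairs, where the
    realized value is the agent's true value for the realized decisions.
    """
    return sum(
        p * (v - dynamic_vcg_payment(v, v_star_with, v_star_without))
        for p, v in outcomes
    )
```

For example, suppose an agent with value 10 is allocated with probability 0.5 and that the optimal MDP value is 11 with the agent and 9 without it. The agent pays 10 - 2 = 8 when allocated and -2 (that is, it receives 2) when not allocated, so its expected utility is 0.5(2) + 0.5(2) = 2, which equals its expected marginal contribution.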

For incentive-compatibility, we need the technical property of stalling, which requires
that the expected value of policy $\pi^*$ cannot be improved (in expectation) by
delaying the report of an agent.⁹ In addition, we assume an independence property;
namely, the probabilistic process defining the arrival of agents other than $i$ is independent
of whether or not agent $i$ has arrived.


Theorem 16.28 The dynamic VCG mechanism, coupled with a policy that sat- 
isfies stalling, is Bayes—Nash incentive compatible (BNIC) and implements the 
expected-value maximizing policy, in a domain with no early-arrival misreports 
but arbitrary misreports of departure. 


8 The mechanism is presented in the no early-arrival misreports model but remains BNIC without this assumption.
9 This is typically reasonable, for example any optimal policy that is able to delay for itself any decisions that
pertain to the value of an agent will automatically satisfy stalling.


PROOF Consider the expected utility (defined with respect to its information in
period $a_i$) to agent $i$ for misreport $\hat{\theta}_i \in C(\theta_i)$. Let $c \geq 0$ denote the number of
periods by which agent $i$ misreports its arrival time. The agent’s expected utility
is


$$\underbrace{E_{\pi^*}\{v_i(\theta_i, \pi^*(\theta^{\leq \hat{d}_i}, \omega^{\leq \hat{d}_i})) \mid \hat{\theta}_i\}}_{(A)} + \underbrace{E_{\pi^*}\Big\{\sum_{t=a_i+c}^{T} R_{-i}(h^t, \pi^*(h^t))\Big\}}_{(B)} - \underbrace{E_{\pi^*}\{V^*(h^{a_i+c}_{-i})\}}_{(C)}.$$


Term (A) denotes the expected value to agent $i$ given its misreport. Term (B),
which denotes the total expected value to other agents forward from reported
arrival, $a_i + c$, given agent $i$’s misreport, corresponds to the expected value of
terms $\{-v_i(\hat{\theta}_i, \pi^*(\theta^{\leq \hat{d}_i}, \omega^{\leq \hat{d}_i})) + V^*(h^{\hat{a}_i})\}$ in the payment. Notation $R_{-i}$ denotes
the total reward that accrues due to all agents except agent $i$. Term (C), which
denotes the total expected value to other agents forward from period $a_i + c$, but
with agent $i$ removed, corresponds to the final term in the payment. Now, add term
$E_{\pi^*}\{\sum_{t=a_i}^{a_i+c-1} R_{-i}(h^t, \pi^*(h^t))\}$ to term (B) and subtract it again from term (C).
The adjusted term (C') is now agent independent (by the independence property)
and can be ignored for the purpose of establishing BNIC. Term (A) combined
with adjusted term (B') is the expected value to all other agents forward from
period $a_i$, plus the expected true value to agent $i$. Agent $i$’s best response is to
report its true type (and immediately upon arrival) because the policy $\pi^*$ is defined
to maximize (A)+(B') when the other agents are truthful, i.e., in a Bayes–Nash
equilibrium.


It bears repeating that truth telling is not a dominant strategy equilibrium. We have 
instead BNIC because the correctness of the policy depends on the center having 
the correct model for the distribution on agent types. Without the correct model, the 
policy is not optimal in expectation and an agent with beliefs different from that of the 
center may be able to improve (its belief about) the expected utility it will receive by 
misreporting its type and thus misrepresenting the state.¹⁰


16.4.3 Remarks 


We end this section with some general remarks that touch on the computational aspects 
of planning in model-based environments, and also describe a couple of additional 
environments in which dynamic VCG mechanisms can be usefully applied. 


10 Ex-post IR is achieved when the environment satisfies agent-monotonicity, which requires that introducing an
agent increases the MDP value of any state. The payments collected by the mechanism are nonnegative in
expectation (ex ante BB) when the environment satisfies no positive externalities, which requires that the arrival
of an agent does not have a positive expected effect on the total value of the other agents.


Computational notes. Many algorithms exist to compute optimal decision policies
in MDPs. These include dynamic programming, value iteration, policy iteration, and
LP-based methods. However, the state space and action space for real-world online
MD problems are large and approximations will typically be required. One appealing
method is to couple the VCG mechanism with an online, sampling-based approximation
algorithm. Rather than compute a priori an entire policy for every possible state, one can
determine the next decision to make in state $h^t$ by approximating the decision problem
forward from that state. Given an $\epsilon$-approximation, the dynamic VCG mechanism is
$\epsilon$-BNIC, in the sense that no agent can gain more than some amount $\epsilon > 0$ (that can be
made arbitrarily small) by deviating from truthful reporting, as long as the other agents
are truthful and an $\epsilon$-accurate estimate of the optimal MDP value is also available. One
class of online, sparse-sampling algorithms works by building out a sample tree of future
states based on decisions that could be made by the policy forward to some look-ahead
horizon. These algorithms have run time that is independent of the size of the state space
but scales exponentially in the number of decisions and in the look-ahead horizon. More
recently, a family of stochastic online combinatorial optimization algorithms has been
proposed that seems especially applicable to online MD environments. The algorithms
solve a subclass of MDPs in which the realization of uncertainty is independent of
any decision. This is often a natural assumption for truthful dynamic auctions: the
allocation decisions made by an IC auction will not affect the reports of agents, and
thus the realization of new types is independent of decisions.
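
The sparse-sampling idea can be sketched as a recursive look-ahead over sampled successor states. This is a simplified illustration in the spirit of the sparse-sampling approach, not a reproduction of any particular published algorithm; the function names, the `width` and `depth` parameters, and the generative-model interface (a callable returning one sampled next state and reward) are assumptions made for the example.

```python
import random
from typing import Callable, Hashable, Sequence, Tuple

State = Hashable
Decision = Hashable
# A generative model draws one (next_state, reward) sample for (state, decision).
Model = Callable[[State, Decision, random.Random], Tuple[State, float]]
Decisions = Callable[[State], Sequence[Decision]]


def sparse_sampling_q(state: State, k: Decision, depth: int, width: int,
                      decisions: Decisions, model: Model,
                      rng: random.Random) -> float:
    """Estimate the value of decision k by averaging `width` sampled returns."""
    total = 0.0
    for _ in range(width):
        next_state, reward = model(state, k, rng)
        total += reward + sparse_sampling_value(
            next_state, depth - 1, width, decisions, model, rng
        )
    return total / width


def sparse_sampling_value(state: State, depth: int, width: int,
                          decisions: Decisions, model: Model,
                          rng: random.Random) -> float:
    """Estimate the optimal value of `state` over `depth` remaining periods."""
    if depth == 0:
        return 0.0
    return max(
        sparse_sampling_q(state, k, depth, width, decisions, model, rng)
        for k in decisions(state)
    )


def choose_decision(state: State, depth: int, width: int,
                    decisions: Decisions, model: Model,
                    rng: random.Random) -> Decision:
    """Pick the decision with the highest sampled look-ahead estimate."""
    return max(
        decisions(state),
        key=lambda k: sparse_sampling_q(state, k, depth, width, decisions, model, rng),
    )
```

The sample tree built by this sketch has on the order of (number of decisions × `width`) raised to the power `depth` nodes, so the work does not depend on the number of states but grows exponentially with the look-ahead horizon.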


Infinite time horizon and discounting. The dynamic VCG mechanism can be ex- 
tended to handle an infinite time horizon when every agent has a common discount 
factor. Rather than collect a payment once, upon departure, a payment can be collected 
from agent i in each period, so as to align its utility stream with the expected, marginal 
stream of value that it contributes through its presence in the multiagent system. 


Coordinated learning. A variant on the dynamic VCG mechanism can be used to 
support optimal, coordinated learning among a fixed population of self-interested 
agents. Suppose that in addition to influencing the reward received by an agent in 
each time period, the decisions made by a mechanism also reveal information that 
an agent can use to update its belief about its type; i.e., types are revealed online. A 
simple model is provided by a multiagent variation on the classical multi-armed bandits 
problem. Each agent owns an “arm” and receives a reward when its arm is activated, 
sampled from a stationary distribution. The reward signals are privately observed and 
allow an agent to update its model for the reward on its arm. In a setting with an 
infinite time horizon and discounting, one can use Gittins’ celebrated index policy 
to characterize an efficient online policy that makes the optimal trade-off between 
exploitation and exploration. In the presence of self-interest, a variant on the dynamic 
VCG mechanism can provide incentives to support truthful reporting of reward signals 
by each agent, and thus implement the efficient learning policy. 


16.5 Conclusions 


We briefly consider some of the many possible future research directions in the area of 
online mechanism design: 


• Revenue: Little work exists on the design of revenue-maximizing online mechanisms
in model-based environments. For example, the problem of designing an analog to
Myerson’s optimal auction is only partially solved, even in the very simplest of online
settings.

• Learning by the center: It is interesting to allow the mechanism to improve its probabilistic
model of the distribution on agent types across time, while retaining incentive
compatibility along the path of learning, and seek to converge to an efficient or revenue-optimal
mechanism.

• Alternative solution concepts: Introduce weaker solution concepts than DSIC that avoid
the strong common knowledge assumptions that are required to justify BNIC analysis.
These could include, for instance, set Nash equilibria, implementation in undominated
strategies, or implementation in minimax-regret equilibria and other robust solution
concepts.

• Endogenous information: Extend online MD to domains in which decisions made by
the mechanism affect the information available to agents about their types; i.e., cast
online MD as a general problem of coordinated learning by self-interested agents in an
uncertain environment.

• Richer domains: The current work on dominant-strategy implementation is limited
to single-valued preference domains with quasi-linear utilities. Simple generalizations,
such as to an environment in which some agents want an apple, some a banana, and some
are indifferent across an apple and a banana, do not satisfy the partition requirement on
the structure of interesting sets and remain unsolved. Similar complications occur when
one incorporates budget constraints, or generalizes to interdependent valuations. With
time, perhaps progress can be made on the problem of online combinatorial auctions
(and exchanges) in their full generality.


16.6 Notes 


Lavi and Nisan (2000) coined the term online auction and initiated the study of truthful 
mechanisms in dynamic environments within the computer science literature. Friedman 
and Parkes (2003) later coined the term online mechanism design. The characterization 
of monotonicity requirements for truthful online mechanisms in single-valued domains 
is based on Hajiaghayi et al. (2005), with extensions to single-valued preferences 
building on Babaioff et al. (2006), see also Chapter 12.¹¹ Weak-monotonicity and its
role in truthful mechanism design are discussed in Bikhchandani et al. (2006). 

The discussion of the secretary problem and adaptive truthful auctions in the single- 
item setting is based on Hajiaghayi et al. (2004); see Babaioff et al. (2007) for a recent 
extension and (Gilbert and Mosteller, 1966; Dynkin, 1963) for classic references. The 
discussion of online mechanisms for expiring items is based on Hajiaghayi et al. (2005), 
and the negative result is due to Lavi and Nisan (2005), who also adopted an alternate 
solution concept in their analysis; see also (Ng et al., 2003; Porter, 2004; Juda and 
Parkes, 2006) and Awerbuch et al. (2003). Additional models of dynamic auctions in 
the computer science literature include unlimited supply, digital goods (Bar-Yossef
et al., 2002; Blum et al., 2003; Blum and Hartline, 2005), two-sided auctions with both
buyers and sellers (Bredin and Parkes, 2005; Blum et al., 2006), and interdependent
value environments (Constantin et al., 2007). For an extended treatment of the single-valued
setting, see Parkes and Duong (2007).


11 The original paper by Hajiaghayi et al. (2005) mischaracterized the monotonicity requirement that is necessary
for the truthful implementation of stochastic policies. This was originally brought to the attention of the authors
by R. Vohra. The corrected analysis (presented here) is due to M. Mahdian.

Moving to the model-based framework, the discussion of the dynamic VCG mech- 
anism is based on Parkes and Singh (Parkes and Singh, 2003; Parkes et al., 2004). A 
general presentation is given in Bergemann and Valimaki (2006b), whose work along
with that of Cavallo et al. (2006) and Bapna and Weber (2006) pertains to a model 
of coordinated learning; see also (Bergemann and Valimaki, 2003, 2006a; Athey and 
Segal, 2007). Pai and Vohra (2006) advance the study of revenue-optimal online mech- 
anisms in model-based environments, and together with Gallien (2006) work to extend 
Myerson’s (1981) optimal auction to dynamic environments; see also Cremer et al. 
(2007). The observation about the failure of the revelation principle, the example to 
illustrate the role of nonnegative payments, as well as inspiration for the example of 
a truthful, stochastic policy are due to Pai and Vohra (2006). For references on on- 
line algorithms and methods for solving sequential decision problems, see (Borodin 
and El-Yaniv, 1998; Van Hentenryck and Bent, 2006; Puterman, 1994; Kearns et al., 
1999). 
Exercises


16.1 Prove that the revelation principle holds with no early-arrival and no late-departure
misreports and prove the “revelation principle + heartbeats” result in combination
with no early-arrival misreports.


16.2 Consider a (known interesting set) single-valued preference domain with no late-departure
misreports. Show that any decision policy $\pi$ that can be truthfully implemented
by an IR mechanism, and does not pay unallocated agents, must be
monotonic-early (for a suitable definition of monotonic-early).


16.3 Prove that the approach outlined to constructing truthful online auctions in terms
of an agent-independent price schedule $q_i(L, \theta_{-i}, \omega)$ induces a monotonic-late decision
policy and critical-value payments. How would you modify the construction
for an environment with both no early-arrival and no late-departure misreports?


16.4 Construct an example to show that the greedy auction in the expiring items setting
has an arbitrarily bad competitive ratio with respect to offline VCG revenue.


16.5 Establish that the self-consistency property on prices in Section 16.3.4, coupled
with the condition that a mechanism selects an outcome that maximizes utility for
every agent at these prices, is sufficient for truthfulness. Prove that the condition
reduces to agent-independent prices for unrestricted misreports.


16.6 Prove that modifications (i)–(iii) in Section 16.3.4 are sufficient to achieve truthfulness
with agents with unknown interesting sets, together with no early-arrival and
no late-departure misreports and a critical-value payment. What could break if the
interesting sets are not disjoint, or if the policy is not minimal?


16.7 Show that the stochastic policy outlined in Example 16.26 satisfies monotonicity
conditions (16.12) and (16.13).


16.8 Define a dynamic VCG mechanism that works for infinite time horizon and agents
with a common, known discount factor $\gamma \in (0, 1)$.
CHAPTER 17


Introduction to the Inefficiency 
of Equilibria 


Tim Roughgarden and Eva Tardos 


Abstract 


This chapter presents motivation and definitions for quantifying the inefficiency of equilibria in 
noncooperative games. We illustrate the basic concepts in four fundamental network models, which 
are studied in depth in subsequent chapters. We also discuss how measures of the inefficiency of 
equilibria can guide mechanism and network design. 


17.1 Introduction 


17.1.1 The Inefficiency of Equilibria 


The previous two parts of this book provided numerous examples demonstrating that 
the outcome of rational behavior by self-interested players can be inferior to a centrally 
designed outcome. This part of the book is devoted to the question: by how much? 

To begin, recall the Prisoner’s Dilemma (Example 1.1). Both players suffer a cost 
of 4 in the unique Nash equilibrium of this game, while both could incur a cost of 2 by 
coordinating. There are several ways to formalize the fact that the Nash equilibrium in 
the Prisoner’s Dilemma is inefficient. A qualitative observation is that the equilibrium 
is strictly Pareto inefficient, in the sense that there is another outcome in which all of the 
players achieve a smaller cost. This qualitative perspective is particularly appropriate 
in applications where the “cost” or “payoff” to a player is an abstract quantity that 
only expresses the player’s preferences between different outcomes. However, payoffs 
and costs have concrete interpretations in many applications, such as money or the 
delay incurred in a network. We can proceed more quantitatively in such applications 
and posit a specific objective function, defined on the outcomes of the game, that 
numerically expresses the “social good” or “social cost” of an outcome. Two prominent 
objective functions are the utilitarian and egalitarian functions, defined as the sum of 
the players’ costs and the maximum player cost, respectively. The Nash equilibrium in 
the Prisoner’s Dilemma does not minimize either of these objective functions. 




Introducing an objective function enables us to quantify the inefficiency of equilib- 
ria, and in particular to deem certain outcomes of a game optimal or approximately 
optimal. The primary goal of this part of the book is to understand when, and in what 
senses, game-theoretic equilibria are guaranteed to approximately optimize natural ob- 
jective functions. Such a guarantee implies that selfish behavior does not have severe 
consequences, and thus the benefit of imposing additional control over players’ actions 
is relatively small. Guarantees of this sort are particularly useful in many computer 
science applications, especially those involving the Internet, where implementing an 
optimal solution can be impossible or prohibitively expensive. 

In the remainder of this section, we discuss different measures that quantify the 
inefficiency of equilibria. In Section 17.2, we illustrate these concepts and motivate 
Chapters 18–21 via several examples in network games. Section 17.3 demonstrates how
these same concepts provide a comparative framework for mechanism and network 
design. Section 17.4 concludes with bibliographic notes and suggestions for further 
reading. 


17.1.2 Measures of Inefficiency 


Several measures of “the inefficiency of the equilibria of a game” have been considered. 
All of these measures are defined, roughly, as the ratio between the objective function 
value of an equilibrium of the game and that of an optimal outcome. To specify such a 
measure precisely, we must answer the following basic modeling questions. 


(1) How are the payoffs or costs of players expressed? 

(2) What objective function do we use to compare different outcomes of the game? 
(3) What is our definition of “approximately optimal”?

(4) What is our definition of an “equilibrium”? 

(5) When there are multiple equilibria, which one do we consider? 


We next discuss, at a high level, the most commonly studied answers to all of these 
questions. We give several examples in Section 17.2. 

The answer to the first question will be some concrete payoff that players seek to 
maximize (such as money earned), or a cost that players aim to minimize (such as 
network delay). Both cases arise naturally in the applications studied in this book. 

Second, we focus primarily on the utilitarian objective function, where the goal is to 
maximize the sum of players’ payoffs or minimize the sum of players’ costs. However, 
we also study the egalitarian objective function in Section 17.2.3 and Chapter 20. 
We call an outcome of a game optimal if it optimizes the chosen objective function. 
For example, in the Prisoner’s Dilemma, the coordinated outcome is optimal for both 
the utilitarian and egalitarian objective functions. While in principle the measures of 
inefficiency below make sense for most objective functions, we can only expect the 
outcome of selfish behavior to approximate an optimal outcome when the objective 
function is related to the players’ objectives. 

Third, we quantify the extent to which a given outcome approximates an optimal 
one according to the ratio between the objective function values of the two outcomes. 
We consider only nonnegative objective functions, so this ratio is always nonnegative. 
(By convention, we interpret the ratio c/0 as 1 if c = 0 and as +∞ if c > 0.) This ratio
is at least 1 for minimization objectives and at most 1 for maximization objectives.
In either case, a value close to 1 indicates that the given outcome is approximately 
optimal. For example, in the Prisoner’s Dilemma, the sum of the players’ costs in the 
Nash equilibrium is 8; since the minimum-possible sum of costs is 4, the corresponding 
ratio for the equilibrium outcome is 2. As Section 17.4 discusses, this use of a ratio 
is directly inspired by many similar notions of approximation that have been widely 
studied in theoretical computer science. While other notions of approximation are 
possible, almost all work on quantifying the inefficiency of equilibria has followed the 
approach taken here. 

Several equilibrium concepts have been studied in different applications. In this 
chapter, we confine our attention to Nash equilibria and their analogues in games 
where the set of players or strategies is not finite. One particularly important issue not 
addressed in this chapter is the relationship between the inefficiency of equilibria and 
the ability of players to reach an equilibrium. In other words, a bound on the inefficiency 
of the equilibria of a game is much more compelling if we expect players to learn or 
converge to one of these equilibria. In many of the games discussed in this part of 
the book, relatively weak assumptions imply that local, uncoordinated optimization 
by players leads to an equilibrium outcome in a reasonable amount of time (see 
Sections 4.7 and 29.3). Some important classes of network games, however, do not 
admit such convergence results. This fact motivated researchers to define novel notions 
of “equilibrium outcomes,” which include all Nash equilibria but also allow players to 
wander among a set of unstable outcomes. In some applications, all such equilibria, and 
not just the Nash equilibria, are guaranteed to be approximately optimal. Chapter 19 
briefly discusses some results of this type. See Section 17.4 for further details. 

Finally, given a choice of an objective function and an equilibrium concept, a 
game may have different equilibria with different objective function values; recall, for 
example, the coordination games of Section 1.1.3. In such games, it is not clear which 
equilibrium should be compared to an optimal outcome. Section 17.1.3 discusses the 
two most popular approaches. 


17.1.3 The Price of Anarchy and the Price of Stability 


The price of anarchy, the most popular measure of the inefficiency of equilibria, 
resolves the issue of multiple equilibria by adopting a worst-case approach. Precisely, 
the price of anarchy of a game is defined as the ratio between the worst objective 
function value of an equilibrium of the game and that of an optimal outcome. Note 
that the price of anarchy of a game is defined with respect to a choice of objective 
function and a choice of equilibrium concept. For example, as shown in Section 17.2.3 
below, the price of anarchy of a game is generally different for different choices of an 
objective function. 

We are interested in identifying games in which the price of anarchy is close to 1; 
in these games, all equilibria are good approximations of an optimal outcome. We 
view selfish behavior as benign in such games. Put differently, the benefit provided by 
(possibly costly or infeasible) dictatorial control over the players’ actions is reasonably 
small. 

A game with multiple equilibria has a large price of anarchy even if only one of 
its equilibria is highly inefficient. The price of stability is a measure of inefficiency 
designed to differentiate between games in which all equilibria are inefficient and those 
in which some equilibrium is inefficient. Formally, the price of stability of a game is 
the ratio between the best objective function value of one of its equilibria and that of an 
optimal outcome. Of course, in a game with a unique equilibrium, its price of anarchy 
and price of stability are identical. For a game with multiple equilibria, its price of 
stability is at least as close to 1 as its price of anarchy, and it can be much closer (see 
Example 17.2 below). 
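
The two definitions can be sketched in a few lines of Python. This is only an illustration for cost-minimization games with finitely many equilibria; the function names and the sample costs are hypothetical and do not come from the text.

```python
from fractions import Fraction


def price_of_anarchy(equilibrium_costs, optimal_cost):
    """Worst equilibrium cost divided by the optimal cost (cost-minimization games)."""
    return Fraction(max(equilibrium_costs)) / Fraction(optimal_cost)


def price_of_stability(equilibrium_costs, optimal_cost):
    """Best equilibrium cost divided by the optimal cost (cost-minimization games)."""
    return Fraction(min(equilibrium_costs)) / Fraction(optimal_cost)


# Hypothetical game: three equilibria with costs 2, 3, and 6; the optimal cost is 2.
costs = [2, 3, 6]
print(price_of_anarchy(costs, 2))    # 3
print(price_of_stability(costs, 2))  # 1
```

In this hypothetical game a single bad equilibrium drives the price of anarchy to 3, while the price of stability is 1 because the best equilibrium is optimal.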

A bound on the price of stability, which ensures only that one equilibrium is ap- 
proximately optimal, provides a significantly weaker guarantee than a bound on the 
price of anarchy. Nevertheless, there are two reasons to study the price of stability. 
First, in some applications, a nontrivial bound is possible only for the price of stability 
(see Section 17.2.2). Second, the price of stability has a natural interpretation in many 
network games — if we envision the outcome as being initially designed by a central 
authority for subsequent use by selfish players, then the best equilibrium is an obvious 
solution to propose. Indeed, in many networking applications, it is not the case that 
agents are completely independent; rather, they interact with an underlying protocol 
that essentially proposes a collective solution to all participants, who can either accept it 
or defect from it. The price of stability measures the benefit of such protocols. Because 
of this interpretation, the price of stability is typically studied only for equilibrium 
concepts that involve no randomization, such as pure-strategy Nash equilibria. For ex- 
ample, since a mixed-strategy Nash equilibrium might randomize only over outcomes 
that are not (pure-strategy) Nash equilibria, it is not clear how to interpret it as a single 
proposed outcome for future use by selfish players. 

The price of stability thus quantifies the necessary degradation in solution quality 
caused by imposing the game-theoretic constraint of stability. The goal of seeking a 
good equilibrium is reminiscent of the general motives of mechanism design (Part II): 
designing a game outcome that (approximately) optimizes a social objective function 
and is also consistent with self-interested behavior. 

In this book, we will only quantify the inefficiency of the worst or the best equilib- 
rium of a game. A third interesting approach is to analyze a “typical” equilibrium. Such 
“average-case analyses” are notoriously difficult to define in a meaningful and analyt- 
ically tractable way, however, and this approach has not yet been used successfully to 
study the inefficiency of equilibria. 


17.2 Fundamental Network Examples 


Even in very simple games, equilibria can be arbitrarily inefficient. For example, 
consider the Prisoner’s Dilemma, and let the players’ costs in the Nash equilibrium 
tend to infinity. For every reasonable objective function, the objective function value 
of the unique Nash equilibrium is arbitrarily larger than that of the optimal outcome. 
Since the inefficiency of equilibria cannot be bounded in general, a natural goal is 
to identify classes of games in which equilibria are guaranteed to be approximately 
optimal. Fortunately, this is the case for a wide array of fundamental network models. 


Figure 17.1. Pigou’s example. The cost function c(x) describes the cost incurred by users of an 
edge, as a function of the amount of traffic routed on the edge. (The upper edge is labeled 
c(x) = 1 and the lower edge c(x) = x.) 


This section illustrates the concepts defined in Section 17.1 by informally introducing 
four such models. Chapters 18-21 study these and related models in greater depth. 


17.2.1 Selfish Routing 


We begin with a model of “selfish routing” that is discussed extensively in Chapter 18. 
We introduce the model via Pigou’s example, which was first discussed in 1920 by the 
economist Pigou. 


Example 17.1 (Pigou’s Example) Consider the simple network shown in 
Figure 17.1. Two disjoint edges connect a source vertex s to a destination vertex t. 
Each edge is labeled with a cost function c(·), which describes the cost (e.g., travel 
time) incurred by users of the edge, as a function of the amount of traffic routed 
on the edge. The upper edge has the constant cost function c(x) = 1, and thus 
represents a route that is relatively long but immune to congestion. The cost of 
the lower edge, which is governed by the function c(x) = x, increases as the edge 
gets more congested. In particular, the lower edge is cheaper than the upper edge 
if and only if less than one unit of traffic uses it. We are interested in the price of 
anarchy of this game. 

Suppose that there is one unit of traffic, representing a very large population of 
players, and that each player chooses independently between the two routes from 
s to t. Assuming that each player aims to minimize its cost, the lower route is a 
dominant strategy. In the unique equilibrium, all players follow this strategy, and 
all of them incur one unit of cost. 

To define the optimal outcome, we assume that the objective function is to 
minimize the average cost incurred by players. In the above equilibrium, this 
average cost is 1. A simple calculation shows that splitting the traffic equally 
between the two links is the optimal outcome. In this outcome, half of the traffic (on 
the upper link) incurs cost 1, while the other half (on the lower link) experiences 
only 1/2 units of cost. Since the average cost of traffic in this optimal outcome 
is 3/4, both the price of anarchy and the price of stability in this game equal the 
ratio 1/(3/4) = 4/3. 
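
The "simple calculation" can be checked numerically. The sketch below is a brute-force search over a grid of traffic splits, not a proof; the function names and the grid resolution are our own choices.

```python
from fractions import Fraction


def average_cost(x_lower):
    """Average cost when a fraction x_lower of the traffic uses the lower edge."""
    x_upper = 1 - x_lower
    return x_upper * 1 + x_lower * x_lower  # upper edge costs 1, lower edge costs x


def solve_pigou(steps=1000):
    equilibrium_cost = average_cost(Fraction(1))   # all traffic on the lower edge
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    best_split = min(grid, key=average_cost)       # brute-force search over splits
    optimal_cost = average_cost(best_split)
    return best_split, optimal_cost, equilibrium_cost / optimal_cost


print(solve_pigou())  # (Fraction(1, 2), Fraction(3, 4), Fraction(4, 3))
```

Exact rational arithmetic avoids rounding issues, so the search returns the split 1/2, the optimal average cost 3/4, and the ratio 4/3 stated above.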

General selfish routing games are conceptually similar to Pigou’s example, but 
are more complex in several respects: the network can be an arbitrarily large di- 
rected graph, different players can have different source and destination vertices, 
and edge cost functions can be arbitrary nonnegative, continuous, and nondecreasing 
functions. 

One property of Pigou’s example that holds more generally is that the price of 
anarchy and the price of stability are equal — that is, the average cost incurred by traffic 
is the same in all equilibria of the game. Chapter 18 proves this “essential uniqueness” 
property using a powerful and flexible technique called the potential function method. 
Roughly, a potential function for a game is a real-valued function, defined on the set of 
possible outcomes of the game, such that the equilibria of the game are precisely the 
local optima of the potential function. Not all games admit natural potential functions, 
but most of the ones discussed in this part of the book do. As we will see in Chapters 18 
and 19, when a game admits a potential function, there are typically consequences for 
the existence, uniqueness, and inefficiency of equilibria. 

One of the goals of Chapter 18 is to understand how the price of anarchy of a selfish 
routing game depends on different properties of the network. For example, recall that 
the price of anarchy in Pigou’s example is precisely 4/3. Does this bound degrade 
as the network size grows? As the number of distinct source and destination vertices 
increases? As the edge cost functions become increasingly nonlinear? If players control 
a nonnegligible fraction of the overall traffic? Chapter 18 provides answers to all of 
these questions. For example, in every network with affine cost functions (of the form 
ax + b), no matter how large and complex, the price of anarchy is at most 4/3. With 
arbitrary cost functions, even with the simple network structure shown in Figure 17.1, 
the price of anarchy can be arbitrarily large (Exercise 17.1). 
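
One way to see the last claim is to replace the lower edge's cost function in Pigou's example with c(x) = x^p. This family is our assumption here; the text leaves the construction to Exercise 17.1. All traffic still uses the lower edge at equilibrium (cost 1), while the optimal split follows from setting the derivative of (1 - x) + x^(p+1) to zero. A minimal sketch:

```python
def nonlinear_pigou_ratio(p):
    """Price of anarchy in Pigou's network when the lower edge costs x**p (p >= 1)."""
    # The optimal flow on the lower edge solves (p + 1) * x**p = 1.
    x_opt = (p + 1) ** (-1.0 / p)
    optimal_cost = (1 - x_opt) + x_opt ** (p + 1)
    equilibrium_cost = 1.0  # all traffic takes the lower edge and pays 1**p = 1
    return equilibrium_cost / optimal_cost


for p in (1, 10, 100, 1000):
    print(p, nonlinear_pigou_ratio(p))
```

For p = 1 this recovers 4/3, and the ratio grows without bound as p increases.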


17.2.2 Network Design and Formation Games 


Chapter 19 studies a diverse set of models of network formation and network de- 
sign with selfish players. Here we discuss only one, with the goal of illustrating the 
differences between the price of anarchy and the price of stability. 

We define a Shapley network design game as follows. Like selfish routing games, 
such a network design game occurs in a directed graph G. Each edge e of the graph has 
a fixed, nonnegative cost $c_e$. There are k players, and each player i is associated with 
a source vertex $s_i$ and a destination vertex $t_i$. Player i wants to establish connectivity 
from its source to its destination, and its strategies are therefore the $s_i$-$t_i$ paths of G. 
Given a choice of a path $P_i$ by each player i, we assume that the formed network is 
simply the union $\cup_i P_i$ of these. The cost of this network is the sum $\sum_{e \in \cup_i P_i} c_e$ of the 
costs of these edges, and we assume that this cost is passed on to the players in a natural 
way: the cost of each edge of the formed network is shared equally by the players who 
use it. More formally, each player i incurs cost $c_e/f_e$ for each edge e of its path $P_i$, 
where $f_e$ denotes the number of players selecting paths that contain the edge e. This 
defines a finite noncooperative game, and we are interested in the inefficiency of the 
pure-strategy Nash equilibria of this game. We assume that the social objective is to 
minimize the cost of the formed network. 


Figure 17.2. Multiple Nash equilibria in Shapley network design games (Example 17.2). 


Example 17.2. Consider the network shown in Figure 17.2. There are k players, 
each with the same source s and destination t. The edge costs are k and 1 + ε, 
where ε > 0 is arbitrarily small. In the optimal outcome, all players choose the 
lower edge. This outcome is also a Nash equilibrium. On the other hand, suppose 
that all of the players choose the upper edge. Each player then incurs cost 1, and if 
a player deviates to the lower edge, it pays the larger cost of 1 + ε. This outcome 
is thus a second Nash equilibrium, and it has cost k. 
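
Both equilibria can be found by enumeration. Because the players are identical, an outcome is determined by the number j of players on the upper edge. The sketch below assumes the upper edge costs k and the lower edge costs 1 + ε, as in the example; the function names and the sample values k = 4, ε = 1/100 are ours.

```python
from fractions import Fraction


def is_equilibrium(j, k, eps):
    """True if 'j players on the upper edge, k - j on the lower' is a pure Nash equilibrium."""
    upper, lower = Fraction(k), 1 + eps
    if j > 0 and lower / (k - j + 1) < upper / j:
        return False  # an upper-edge player gains by moving down
    if j < k and upper / (j + 1) < lower / (k - j):
        return False  # a lower-edge player gains by moving up
    return True


def network_cost(j, k, eps):
    """Total cost of the edges used when j players take the upper edge."""
    return (k if j > 0 else 0) + (1 + eps if j < k else 0)


def analyze(k, eps):
    equilibria = [j for j in range(k + 1) if is_equilibrium(j, k, eps)]
    optimum = min(network_cost(j, k, eps) for j in range(k + 1))
    eq_costs = [network_cost(j, k, eps) for j in equilibria]
    return equilibria, max(eq_costs) / optimum, min(eq_costs) / optimum


print(analyze(4, Fraction(1, 100)))  # ([0, 4], Fraction(400, 101), Fraction(1, 1))
```

The only equilibria are j = 0 and j = k, so the worst-to-optimal ratio is k/(1 + ε) while the best-to-optimal ratio is 1.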


The price of anarchy of the game in Example 17.2 is roughly the number of players, 
and we view this as unacceptably large. This example motivates the study of the price 
of stability of Shapley network design games. Recall from Section 17.1.3 that the price 
of stability has a natural interpretation in network formation games — it measures the 
inefficiency of the network that a designer would propose to selfish players (i.e., the 
best equilibrium). 

The price of stability in Example 17.2 is 1. The next example shows that this is not 
always the case. 


Figure 17.3. The price of stability in Shapley network design games can be at least $H_k$ 
(Example 17.3). 


Example 17.3 ($H_k$ example) Consider the network shown in Figure 17.3. 
There are k players, all with the same sink t, and ε > 0 is arbitrarily small. 
For each i ∈ {1, 2, ..., k}, the edge ($s_i$, t) has cost 1/i. In the optimal outcome, 
each player i chooses the path $s_i$ → v → t and the cost of the formed network 
is 1 + ε. This is not a Nash equilibrium, as player k can decrease its cost from 
(1 + ε)/k to 1/k by switching to the direct path $s_k$ → t; indeed, this direct path 
is a dominant strategy for the kth player. Arguing inductively about the players 
k - 1, k - 2, ..., 1 shows that the unique Nash equilibrium is the outcome in 
which each player chooses its direct path to the sink. The cost of this outcome 
is exactly the kth harmonic number $H_k = \sum_{i=1}^{k} 1/i$, which is roughly ln k. The 
price of stability can therefore be (arbitrarily close to) $H_k$ in Shapley network 
design games. 
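
For small k the uniqueness claim can be confirmed by enumerating all 2^k strategy profiles. The sketch below assumes, consistently with the costs quoted in the example, that each edge from $s_i$ to v is free and that the edge from v to t costs 1 + ε; the function name and the sample values are ours.

```python
from fractions import Fraction
from itertools import product


def analyze_hk_game(k, eps):
    """Return (pure Nash equilibria, cost of every profile); True means 'uses the shared path'."""
    shared_cost = 1 + eps
    equilibria, costs = [], {}
    for profile in product((False, True), repeat=k):
        n_shared = sum(profile)
        total = sum((Fraction(1, i + 1) for i, s in enumerate(profile) if not s), Fraction(0))
        if n_shared:
            total += shared_cost
        costs[profile] = total
        stable = True
        for i, uses_shared in enumerate(profile):
            direct = Fraction(1, i + 1)  # player i + 1 pays 1/(i + 1) alone on its direct edge
            if uses_shared and direct < shared_cost / n_shared:
                stable = False  # leaving the shared path is strictly cheaper
            if not uses_shared and shared_cost / (n_shared + 1) < direct:
                stable = False  # joining the shared path is strictly cheaper
        if stable:
            equilibria.append(profile)
    return equilibria, costs


equilibria, costs = analyze_hk_game(4, Fraction(1, 100))
print(equilibria)                 # [(False, False, False, False)]
print(costs[equilibria[0]])       # 25/12, the harmonic number H_4
print(min(costs.values()))        # 101/100
```

With k = 4 the only equilibrium is the all-direct profile, whose cost 25/12 is $H_4$, against an optimal cost of 1 + ε.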


Our emphasis on pure-strategy Nash equilibria and Example 17.3 motivate the 
following two questions. 


(1) Does every Shapley network design game possess at least one pure-strategy Nash 
equilibrium? (Recall from Example 1.7 that not all games have such equilibria.) 
(2) What is the largest-possible price of stability in Shapley network design games? 


Chapter 19 uses the potential function method discussed in Section 17.2.1 to resolve 
both of these questions. This method answers the first question in the affirmative, and 
also shows that the price of stability in every k-player Shapley network design game 
is at most $H_k$. In other words, for each value of k, the game in Example 17.3 has the 
largest-possible price of stability. 

Chapter 19 also discusses the price of anarchy and stability in other models of selfish 
network design and formation. 


17.2.3 Scheduling Games 


Our next example is a load-balancing scenario, where the goal is to spread several 
identical “jobs” evenly across a number of identical “machines.” This is a very simple 
type of scheduling problem; this and much more general scheduling models have been 
extensively studied and have numerous applications (see Chapter 20). We focus on 
this special case to illustrate a nonutilitarian objective function, mixed-strategy Nash 
equilibria, and the interaction between the two. 

Concretely, we assume that there are m jobs and m machines for some integer m > 1. 
Players correspond to jobs. The strategy set of each player is the set of machines. Each 
player i seeks to minimize the total number of players (including i itself) that select its 
machine. This defines a noncooperative game. The pure-strategy Nash equilibria of this 
game are precisely the m! outcomes in which each player selects a distinct machine. 
There are additional mixed-strategy Nash equilibria, as we discuss below. 

To study the price of anarchy, we require an objective function. Thus far, we have 
studied only utilitarian objective functions, where the goal is to minimize the sum of 
the players’ costs. Here, motivated by the goal of load-balancing, we focus primarily 
on the egalitarian objective of minimizing the number of jobs on the most crowded 
machine. This objective is called makespan minimization in the scheduling literature. 
The set of optimal outcomes under this objective coincides with the set of pure-strategy 
Nash equilibria, and these outcomes all have makespan equal to 1. 

In the previous two examples, we studied only pure-strategy equilibria, where the 
objective function value of an equilibrium is clear. In the present application, we 
also consider mixed-strategy Nash equilibria. Such an equilibrium naturally induces 
a probability distribution on the set of game outcomes. Specifically, since we assume 
that the random choices made by distinct players are independent, the probability of 
a given strategy profile is the product of the probabilities that each player selects its 
prescribed strategy. 

We define the objective function value of a mixed-strategy Nash equilibrium as 
the expectation, over this induced distribution on the game outcomes, of the objective 
function value of an outcome. Thus the objective function value of a mixed-strategy 
Nash equilibrium is its “expected objective function value.” As we now show, allowing 
mixed-strategy Nash equilibria can increase the price of anarchy in scheduling games. 


Example 17.4 (Balls and Bins) In the above example with m jobs and m ma- 
chines, suppose that every player selects a machine uniformly at random. We 
claim that this defines a mixed-strategy Nash equilibrium. To prove it, adopt 
the first player’s viewpoint. Since each of the other m - 1 players chooses a 
machine uniformly at random, all m machines appear equally loaded. More 
formally, let $X_{ij}$ denote the indicator random variable for the event that player i 
selects the machine j. If the first player selects machine j, then it incurs a cost 
of $1 + \sum_{i>1} X_{ij}$. By linearity of expectation, its expected cost on this machine 
is $1 + \sum_{i>1} E[X_{ij}] = 2 - 1/m$. Since this expected cost is independent of the 
machine j, every pure strategy of the first player is a best response to the mixed 
strategies chosen by the other players. As a consequence, every mixed strategy of 
the first player is also a best response (recall Section 1.3.4). This argument clearly 
applies to the other players as well, and hence this set of mixed strategies forms 
a Nash equilibrium. 
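
The expected-cost calculation can be verified exactly for small m by enumerating the choices of the other m - 1 players. This is a sanity check rather than part of the argument; the function name is ours.

```python
from fractions import Fraction
from itertools import product


def expected_cost_on_machine(m, j):
    """Exact expected cost to player 1 of machine j when the other m - 1 players
    each pick one of the m machines uniformly and independently at random."""
    outcomes = list(product(range(m), repeat=m - 1))
    total = sum(1 + choices.count(j) for choices in outcomes)  # 1 counts player 1 itself
    return Fraction(total, len(outcomes))


m = 4
print([expected_cost_on_machine(m, j) for j in range(m)])  # four copies of Fraction(7, 4)
```

Every machine yields the same expected cost 2 - 1/m, which is why every pure strategy is a best response.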

What is the objective function value of this mixed-strategy Nash equilibrium — 
the expected value of the most heavily loaded machine? We emphasize that this 
expectation $E[\max_j \sum_i X_{ij}]$ is not the same as the maximum expected load, 
$\max_j E[\sum_i X_{ij}]$, which is only 1. Intuitively, the expected number of jobs on 
the most crowded machine is governed by the severity of the “collisions” that 
occur when the players select machines in a randomized and uncoordinated way. 
This nontrivial problem, typically called the balls into bins problem, is classical 
and has been thoroughly analyzed. In particular, the objective function value of 
the above mixed-strategy Nash equilibrium is Θ(log m/ log log m) as m grows 
large. (See Chapter 20.) 
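
The gap between the two quantities already shows up for very small m. The sketch below enumerates all m^m equally likely outcomes, so it is only practical for small m; the function name is ours.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def load_statistics(m):
    """Return (expected maximum load, maximum expected load) for m jobs placed
    independently and uniformly at random on m machines."""
    total_max = 0
    total_per_machine = [0] * m
    for outcome in product(range(m), repeat=m):  # outcome[i] is the machine of job i
        loads = Counter(outcome)
        total_max += max(loads.values())
        for j in range(m):
            total_per_machine[j] += loads[j]     # a Counter returns 0 for unused machines
    n_outcomes = m ** m
    expected_max = Fraction(total_max, n_outcomes)
    max_expected = max(Fraction(t, n_outcomes) for t in total_per_machine)
    return expected_max, max_expected


print(load_statistics(2))  # (Fraction(3, 2), Fraction(1, 1))
print(load_statistics(3))  # (Fraction(17, 9), Fraction(1, 1))
```

The maximum expected load stays at 1, while the expected maximum load increases with m.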

Collisions due to independent random choices therefore give rise to significant 
inefficiency: the price of anarchy with respect to pure-strategy Nash equilibria in 
this example is 1 for every m > 1, whereas the price of anarchy with respect to 
mixed-strategy Nash equilibria is Ω(log m/ log log m) as m grows large. 


Example 17.4 shows that the price of anarchy can depend fundamentally on the 
choice of equilibrium concept; recall the fourth question of Section 17.1.2. As an 
aside, we note that it also illustrates the dependence of the price of anarchy on the 
choice of objective function. Specifically, consider the utilitarian objective function, 
where the goal is to minimize the sum of the players’ costs. The optimal outcomes again 
coincide with the pure-strategy Nash equilibria, and all of these have objective function 
value equal to m. The objective function value of the mixed-strategy Nash equilibrium 
in Example 17.4 is defined as the expected sum of the players’ costs, which by linearity 
of expectation is the same as the sum of the players’ expected costs. The calculation 
in Example 17.4 shows that each player’s expected cost equals 2 - 1/m, and thus the 
objective function value of this mixed-strategy Nash equilibrium is precisely 2m - 1. 
This is in fact the worst equilibrium of the game (Exercise 17.3), and hence the price 
of anarchy for the utilitarian objective in this example is only 2 - 1/m. 

We could also consider the objective of minimizing the maximum expected load, 
instead of the expected maximum load, experienced by a player. Both of these objectives 
can be viewed as egalitarian objectives, and they assign the same objective function 
value to every pure strategy profile. In particular, these objective functions have identical 
optimal values. However, they typically assign different values to a profile of mixed 
strategies. For example, the maximum expected load experienced by a player in the 
mixed-strategy Nash equilibrium in Example 17.4 is only 2 - 1/m. This is the worst 
equilibrium (as in Exercise 17.3), and the price of anarchy with respect to the maximum 
expected load of a player is therefore only 2 - 1/m. An arguably undesirable feature 
of this objective is that the price of anarchy is small even though, with high probability, 
the players’ random strategy selections produce a pure strategy profile with objective 
function value Ω(log m/ log log m) times that of optimal. 

Returning to the makespan minimization objective considered in Example 17.4, 
Chapter 20 proves that the price of anarchy is bounded above by O(log m/ log log m) 
in load-balancing games with n jobs and m machines, even when the machines are 
“nonuniform” in a certain precise sense. Chapter 20 also studies the price of anarchy 
in several variants of this scheduling game. 


17.2.4 Resource Allocation Games 


We next study a game that is induced by a natural protocol for allocating resources to 
players with heterogeneous utility functions. Chapter 21 studies such games in much 
greater depth. 

We consider a single divisible resource — the bandwidth of a single network link, 
say — to be allocated to a finite number n > 1 of competing players. We assume 
that each player i has a concave, strictly increasing, and continuously differentiable 
utility function $U_i$. A resource allocation game is defined by the n utility functions 
$U_1, \ldots, U_n$ and the link capacity C > 0. An outcome of such a game is a nonnegative 
allocation vector $(x_1, \ldots, x_n)$ with $\sum_i x_i = C$, where $x_i$ denotes the amount of 
bandwidth allocated to player i. We study the utilitarian objective, and are thus interested 
in maximizing the sum $\sum_i U_i(x_i)$ of the players’ utilities. 

The proportional sharing protocol allocates bandwidth as follows. Each user 
expresses its interest in receiving bandwidth by submitting a nonnegative bid $b_i$. The 
protocol then allocates all of the bandwidth in proportion to the bids, so that each user i 
receives 

$$x_i = \frac{b_i}{\sum_{j=1}^{n} b_j} \cdot C \tag{17.1}$$

units of bandwidth. Player i is then charged its bid $b_i$. See Section 17.3 and Chapter 21 
for a discussion of alternative protocols that have a similar flavor. 

We assume that player payoffs are quasilinear in the sense of Section 9.3. In other 
words, the payoff $Q_i$ to a player i is defined as its utility for the bandwidth it receives, 
minus the price that it has to pay: 

$$Q_i(b_1, \ldots, b_n) = U_i(x_i) - b_i = U_i\!\left(\frac{b_i}{\sum_{j=1}^{n} b_j} \cdot C\right) - b_i. \tag{17.2}$$

Assume that if all players bid zero, then all users receive zero payoff. Our restrictions on 
the utility function $U_i$ ensure that the payoff function $Q_i$ is continuously differentiable 
and strictly concave in the bid $b_i$ for every fixed vector $b_{-i}$ with at least one strictly 
positive component (Exercise 17.4). (As usual, $b_{-i}$ denotes the vector of bids of players 
other than i.) 
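
The allocation rule (17.1) and the payoff (17.2) translate directly into code. The following is a minimal sketch; the function names, the sample bids, and the choice to give every player zero bandwidth when all bids are zero are our assumptions (the text only fixes the payoff in that case).

```python
from fractions import Fraction


def proportional_allocation(bids, capacity):
    """Split the capacity in proportion to the bids, as in (17.1)."""
    total = sum(bids)
    if total == 0:
        return [Fraction(0)] * len(bids)  # assumption: nobody is served if nobody bids
    return [Fraction(b) * capacity / total for b in bids]


def payoff(i, bids, capacity, utility):
    """Quasilinear payoff (17.2) of player i: utility of its share minus its bid."""
    if sum(bids) == 0:
        return 0  # convention from the text: all-zero bids give zero payoff
    share = proportional_allocation(bids, capacity)[i]
    return utility(share) - bids[i]


bids = [2, 1, 1]
print(proportional_allocation(bids, 8))        # shares 4, 2, 2
print(payoff(0, bids, 8, lambda x: 2 * x))     # 2 * 4 - 2 = 6
```

The shares always sum to the capacity, so the protocol never leaves bandwidth unallocated when some bid is positive.
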
An equilibrium is a bid vector in which every user bids optimally, given the bids of 
the other users. 


Definition 17.5 A bid vector $(b_1, \ldots, b_n)$ is an equilibrium of the resource 
allocation game $(U_1, \ldots, U_n, C)$ if for every user i ∈ {1, 2, ..., n}, 

$$Q_i(b_i, b_{-i}) = \sup_{\tilde{b}_i \ge 0} Q_i(\tilde{b}_i, b_{-i}). \tag{17.3}$$


The potential function method also applies to resource allocation games. This method 
can be used to show that, for every resource allocation game, every equilibrium bid 
vector induces the same allocation. Thus, every equilibrium has equal objective function 
value. The next example shows that equilibria in resource allocation games can be 
inefficient. 


Example 17.6 Consider a resource allocation game in which the capacity C 
is 1, the first user has the utility function $U_1(x_1) = 2x_1$, and the other n - 1 users 
have the utility function $U_i(x_i) = x_i$. In the optimal allocation, the first player 
receives all of the bandwidth and the corresponding objective function value is 2. 
This allocation does not, however, arise from an equilibrium. To see why, observe 
that (17.1) implies that the only bid vectors that induce this allocation are those 
in which only the first player submits a positive bid. Such a bid vector cannot be 
an equilibrium, as the first player can bid a smaller positive amount and continue 
to receive all of the bandwidth. (See also Exercise 17.5.) 

A similar argument holds whenever the first player’s bid is a sufficiently large 
fraction of the sum of the players’ bids: if the first player lowers its bid, its 
allocation diminishes, but the effective “price per unit of bandwidth” that it 
pays decreases by a large enough amount to increase its overall payoff. More 
formally, suppose that $(b_1, \ldots, b_n)$ is an equilibrium, and let B denote the sum 
of the bids. By Exercise 17.5, at least two of the bids are strictly positive. By 
definition, the bid $b_1$ satisfies (17.3). Since the payoff function $Q_1$ is continuously 
differentiable and strictly concave in the bid $b_1$ with $b_{-1}$ fixed (Exercise 17.4), 
we can compute $b_1$ by differentiating the right-hand side of (17.3) and setting this 
derivative to zero. Starting from the defining equation (17.2) of the function $Q_1$, 
using that $U_1(x_1) = 2x_1$ and C = 1, and calculating, we obtain the condition 
$2(B - b_1)/B^2 = 1$. For a player i > 1, the same calculation yields the condition 
$(B - b_i)/B^2 = 1$. Subtracting the second equation from the first implies that 
$2b_1 - b_i = B$ for every i = 2, 3, ..., n. Adding these n - 1 equations together 
gives $2(n - 1)b_1 - (B - b_1) = (n - 1)B$; solving, we find that the first player’s 
bid is only an n/(2n - 1) fraction of the sum of the bids: $b_1 = nB/(2n - 1)$. In 
the resulting allocation, the first player obtains only an n/(2n - 1) fraction of the 
bandwidth. As n grows large, roughly half of the bandwidth is allocated to the 
first player, while the rest is split equally between the other n - 1 players. The 
objective function value of this allocation is roughly 3/2, which is only a 3/4 
fraction of the value of the optimal allocation. 
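
The calculation can be replayed in exact arithmetic. Substituting $b_1 = nB/(2n - 1)$ into the first condition gives $B = 2(n - 1)/(2n - 1)$; this closed form is our own step, not one stated in the text, and the sketch below checks it against both first-order conditions. The function name is ours.

```python
from fractions import Fraction


def example_17_6_equilibrium(n):
    """Equilibrium of Example 17.6 with n >= 2 players and capacity C = 1.
    Returns (share of player 1, objective value, fraction of the optimal value 2)."""
    B = Fraction(2 * (n - 1), 2 * n - 1)   # sum of the bids (derived closed form)
    b1 = n * B / (2 * n - 1)               # bid of the first player
    b_other = (B - b1) / (n - 1)           # common bid of every other player
    # Both first-order conditions from the example must hold exactly.
    assert 2 * (B - b1) / B**2 == 1
    assert (B - b_other) / B**2 == 1
    x1 = b1 / B                            # proportional share with C = 1
    welfare = 2 * x1 + (1 - x1)            # U_1 = 2x for player 1, U_i = x for the rest
    return x1, welfare, welfare / 2


for n in (2, 10, 1000):
    print(n, example_17_6_equilibrium(n))
```

The fraction of optimal welfare is 5/6 for n = 2 and decreases toward 3/4 as n grows.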


Intuitively, inefficiency arises in Example 17.6 because of “market power” — the fact 
that a single player receives the lion’s share of the total bandwidth in the optimal allo- 
cation. Indeed, resource allocation games were initially studied under the assumption 
that no users have nontrivial market power; in this case, equilibria are fully efficient 
and the price of anarchy is 1. Details are discussed in Chapter 21. Chapter 21 also uses 
the price of anarchy as a criterion for mechanism and protocol design; we foreshadow 
this work in the next section. 


17.3 Inefficiency of Equilibria as a Design Metric 


17.3.1 Motivation 


In the previous section, we studied four natural network examples. The game was given 
and immutable in all of these examples, and the only question involved the quality of its 
equilibria. While most work on the inefficiency of equilibria has been of this form, the 
flexibility of the framework presented in Section 17.1.2 raises a more general question: 
how can we design a game, or modify an existing game, to minimize the inefficiency 
of its equilibria? This question is especially crucial in settings where equilibria are 
unacceptably inefficient, but directly imposing an optimal solution is impractical. 

Example questions of this type include the following. Among a given class of mech- 
anisms, which one induces a game with the best price of anarchy? Quantitatively, 
what is this best-possible price of anarchy? Given a game and a restricted set of op- 
tions for influencing its equilibria, which option improves the price of anarchy by 
the maximum-possible amount? How large is the improvement? Using the measures 
of inefficiency described in Section 17.1.2, we can rigorously compare the perfor- 
mance of different solutions, and quantify the efficiency loss obtained by an optimal 
solution. 

These goals are conceptually the same as those of algorithmic mechanism design, 
studied in Part II of this book. However, much of the work we describe below and 
in the notes (Section 17.4) differs from the bulk of the material in Part II in three 
technical respects. First, while Part II largely concerns the design of strategyproof 
mechanisms in which truthful revelation is a dominant strategy for every player, we 
study the equilibria of mechanisms that are not generally strategyproof. For example, 
in the proportional sharing mechanism described in Section 17.2.4, the strategy space 
of a player (nonnegative bids) does not coincide with its type space (utility functions), 
and no player has a dominant strategy. These differences are typically motivated by 
practical considerations, as we discuss in Section 17.3.2. Second, some of the research 
described in Section 17.4 considers games without private preferences. In these cases, 
the design problem is nontrivial because the mechanism designer lacks full control over 
the allocation of resources. Optimally influencing traffic in a selfish routing network 
by pricing the network edges is one example of such a problem. Third, in much of the 
work discussed in Section 17.4, the problem is not to design a good mechanism from 
scratch, but rather to leverage a limited amount of power to improve the equilibria of 
a given game as much as possible. 


17.3.2 An Example: The Proportional Sharing Mechanism 


We now informally describe one example of how the inefficiency of equilibria can 
serve as a design metric. Chapter 21 discusses the following result in greater detail, 
and Section 17.4 discusses additional examples. 

Recall the resource allocation games of Section 17.2.4, where n players compete 
for a divisible link with capacity C. We studied the proportional sharing mechanism, 
in which each player i submits a bid $b_i$ to the mechanism, the mechanism allocates 
all of the bandwidth to the players in proportion to their bids, and every player then 
pays its bid. This mechanism induces a noncooperative game; as proved in Chapter 21, 
the price of anarchy of every such game is at least 3/4. We next strive to surpass this 
efficiency guarantee and ask: how can we modify the mechanism so that the price of 
anarchy is always even closer to 1? 

The answer to this question depends crucially on the class of mechanisms that we 
are willing to consider. If we impose no restrictions on the allowable mechanisms, 
then a version of the VCG mechanism (see Chapter 9) always induces a game for 
which the price of anarchy equals 1. However, this solution is “more complicated” than 
the proportional sharing mechanism in two ways. First, the communication from the 
players to the mechanism is more involved; each player must submit a representation of 
its entire utility function, as opposed to a single bid. Second, the communication from 
the mechanism back to the players is also more complicated in the following sense. In 
the proportional sharing mechanism, allocations can be completely summarized by the 
bids and a single additional parameter, the price of bandwidth. To see this, consider 
a bid vector $(b_1, \ldots, b_n)$ for a link with capacity C. Set a price p equal to B/C, 
where B is the sum of the bids. The (proportional) allocation to each player i is then 
simply its bid $b_i$ divided by this price. While the allocations of the VCG mechanism 
can be similarly interpreted in terms of prices, different players are generally allocated 
bandwidth according to different prices. 
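
The single-price summary is easy to illustrate. The sketch below assumes at least one positive bid; the function name and sample bids are hypothetical.

```python
from fractions import Fraction


def price_and_allocations(bids, capacity):
    """Return the uniform price p = B / C and each player's allocation b_i / p."""
    price = Fraction(sum(bids)) / capacity   # requires a positive total bid
    return price, [Fraction(b) / price for b in bids]


price, shares = price_and_allocations([2, 1, 1], 8)
print(price)         # 1/2
print(shares)        # shares 4, 2, 2
print(sum(shares))   # 8, the full capacity
```

Announcing the one number p is enough for every player to recover its own allocation from its own bid.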

The simplicity of the proportional sharing mechanism — that the communication 
both to and from the mechanism is limited — makes it particularly suitable for im- 
plementation in large communication networks. Is there a mechanism that retains 
these appealing properties and has strictly smaller worst-case efficiency loss? Chap- 
ter 21 shows that the answer is “no” — for an appropriate definition of “bounded 
communication,” every equally simple mechanism can induce a game that has a price
of anarchy of at most 3/4. The proportional sharing mechanism is therefore optimal 
among all mechanisms meeting natural, desirable implementation constraints. 


17.4 Notes 


The observation that self-interested behavior can lead to a socially inefficient outcome 
is an old one; see, for example, Dubey (1986), Rapoport and Chammah (1965), and 
the references therein. The idea of quantifying the inefficiency of equilibria using an 
objective function and an approximation measure is much newer. The concept of the 
price of anarchy originated in Koutsoupias and Papadimitriou (1999), where it was 
called the coordination ratio. Koutsoupias and Papadimitriou studied a generalization 
of the scheduling games described in Section 17.2.3. Papadimitriou (2001) introduced 
the term “the price of anarchy.” The price of stability was first studied in Schulz and Stier 
Moses (2003); the terminology is from Anshelevich et al. (2004). Several earlier works, 
and in particular Mason (1985), anticipated these concepts. See also Satterthwaite and 
Williams (1989) and Moulin and Shenker (2001), who studied additive notions of 
efficiency loss in mechanism design applications. 

The measures of inefficiency discussed in Section 17.1 are similar to and motivated 
by several well-established concepts in theoretical computer science. One example is 
the approximation ratio of a heuristic for a (typically NP-hard) optimization prob- 
lem, defined as the worst ratio between the objective function value of the solution 
produced by the heuristic and that of an optimal solution (Vazirani, 2001). While the 
approximation ratio measures the worst-case loss in solution quality due to insuffi- 
cient computational effort, the price of anarchy measures the worst-case loss arising 
from insufficient ability (or willingness) to control and coordinate the actions of selfish 
individuals. 

The novel notions of “equilibrium outcomes” alluded to in Section 17.1.2 are de- 
scribed in Mirrokni and Vetta (2004) and Goemans et al. (2005). Tennenholtz (2002) 
also proposed relaxing the assumption that players reach a Nash equilibrium, and exam- 
ining the consequences for the players’ payoffs. The inefficiency of other equilibrium 
concepts has also been studied; see work by Christodoulou and Koutsoupias (2005) on 
correlated equilibria (Section 1.5), Andelman et al. (2007) on strong Nash equilibria 
(Section 1.6), and Hayrapetyan et al. (2006) on equilibria in the presence of coalitions 
of players. 

Pigou’s example (Example 17.1) is from Pigou (1920). Selfish routing networks 
and their equilibria were defined formally by Wardrop (1952) and Beckmann et al. 
(1956). The potential function method originates in Beckmann et al. (1956) and was 
later developed by Rosenthal (1973), Monderer and Shapley (1996), Roughgarden and 
Tardos (2002), and Anshelevich et al. (2004). Shapley network design games were 
first studied by Anshelevich et al. (2004), and Example 17.3 is from Anshelevich et al. 
(2004). Example 17.2 was given in an earlier paper by Anshelevich et al. (2003). The 
scheduling games of Section 17.2.3 and Example 17.4 are due to Koutsoupias and 
Papadimitriou (1999). See Motwani and Raghavan (1995) for a discussion of the balls 
into bins problem. The proportional sharing mechanism is due to Kelly (1997), and 
Example 17.6 is from Johari and Tsitsiklis (2004). For further references on the four
network models of Section 17.2, see Chapters 18-21. 

The results mentioned in Section 17.3.2 are from Johari and Tsitsiklis (2006), and 
are discussed in further detail in Chapter 21. Chapter 21 also covers variants of the VCG 
mechanism in which users submit only a single bid, rather than an entire utility function. 
These mechanisms are not (and cannot be) single-price in the sense of Section 17.3.2, 
however. 

We conclude these notes with examples of how measures of inefficiency have been 
used to compare different mechanisms and different strategies for influencing equilibria 
in the network models explored in Section 17.2. Several approaches to improving the 
equilibria of a selfish routing network have been considered, including pricing the 
network edges, and routing a small fraction of the traffic in a centralized manner. The 
goal is then to leverage the limited amount of design power to minimize the price of 
anarchy. For details on this literature, see Roughgarden (2005, Chapters 5-6) and the
references therein. 

Motivated by the network design games of Section 17.2.2 and Example 17.3, Chen 
et al. (2007) studied how to design cost-sharing methods to minimize the inefficiency 
of equilibria in the resulting network game. One of the contributions in Chen et al. 
(2007) is an analogue of the result described in Section 17.3.2 for resource allocation 
mechanisms: among all cost-sharing methods that are “oblivious” to the network 
structure in a certain precise sense, the Shapley cost-sharing method of Section 17.2.2 
minimizes the worst-case price of stability. On the other hand, cost-sharing methods 
that can leverage information about the network topology can outperform Shapley 
cost-sharing methods (Chen et al., 2007). 

Finally, for the scheduling games of Section 17.2.3, Christodoulou et al. (2004) and 
Immorlica et al. (2005) design machine scheduling policies to improve the inefficiency 
of equilibria. Informally, such a policy can be used to prioritize one player over another, 
thereby causing different players to incur different costs on a common machine. As 
shown in Christodoulou et al. (2004) and Immorlica et al. (2005), even very simple 
scheduling policies reduce the price of anarchy from logarithmic in the number of 
machines (Example 17.4) to a small constant. 
Exercises


Suppose that we modify Pigou’s example (Example 17.1) so that the lower edge
has the cost function $c(x) = x^d$ for some $d > 1$. What is the price of anarchy of the
resulting selfish routing network, as a function of $d$?


Suppose we modify the $H_k$ example (Example 17.3) so that all of the network edges
are undirected. In other words, each player $i$ can choose a path from $s_i$ to $t$ that
traverses each edge in either direction. What is the price of stability in the resulting
Shapley network design game?


Recall the scheduling game in Example 17.4, with m players and m machines. 
Prove that the price of anarchy of this game with respect to the utilitarian objective 
function is precisely $2 - 1/m$.

Let $U_i$ be a concave, strictly increasing, and continuously differentiable univariate
function. Define the function $Q_i$ as in (17.2). Prove that $Q_i$ is continuously
differentiable and strictly concave in $b_i$ for every fixed nonnegative vector $b_{-i}$ with at
least one strictly positive component.


Prove that every equilibrium of a resource allocation game (Definition 17.5) has at 
least two strictly positive components. 


CHAPTER 18 


Routing Games 


Tim Roughgarden 


Abstract 


This chapter studies the inefficiency of equilibria in noncooperative routing games, in which self- 
interested players route traffic through a congested network. Our goals are threefold: to introduce 
the most important models and examples of routing games; to survey optimal bounds on the price of 
anarchy in these models; and to develop proof techniques that are useful for bounding the inefficiency 
of equilibria in a range of applications. 


18.1 Introduction 


A majority of the current literature on the inefficiency of equilibria concerns routing 
games. One reason for this popularity is that routing games shed light on an important 
practical problem: how to route traffic in a large communication network, such as the 
Internet, that has no central authority. The routing games studied in this chapter are 
relevant for networks with “source routing,” in which each end user chooses a full 
route for its traffic, and also for networks in which traffic is routed in a distributed, 
congestion-sensitive manner. Section 18.6 contains further details on these applications. 

This chapter focuses on two different models of routing games, although the in- 
efficiency of equilibria has been successfully quantified in a range of others (see 
Section 18.6). The first model, nonatomic selfish routing, is a natural generalization of 
Pigou’s example (Example 17.1) to more complex networks. The modifier “nonatomic” 
refers to the assumption that there are a very large number of players, each controlling 
a negligible fraction of the overall traffic. We also study atomic selfish routing, where 
each player controls a nonnegligible amount of traffic. We single out these two models 
for three reasons. First, both models are conceptually simple but quite general. Sec- 
ond, the price of anarchy is well understood in both of these models. Third, the two 
models are superficially similar, but different techniques are required to analyze the 
inefficiency of equilibria in each of them. 

The chapter proceeds as follows. Section 18.2 introduces nonatomic and atomic 
selfish routing games and explores several examples. Section 18.3 studies the existence 
and uniqueness of equilibria in routing games. It also offers a glimpse of the potential
function method, a technique that will be developed further in Chapter 19. Section 18.4 
proves tight upper bounds on the price of anarchy in nonatomic and atomic selfish 
routing games. Section 18.5 proposes two ways to reduce the price of anarchy in 
nonatomic selfish routing games. Section 18.6 concludes with bibliographic notes. 


18.2 Models and Examples 


18.2.1 Nonatomic Selfish Routing 


To introduce nonatomic selfish routing games, we recall the essential features of 
Pigou’s example (Example 17.1 and Figure 17.1). First, we are given a network 
describing the routes available to the players. In Pigou’s example, there are two parallel 
routes, each a single edge, that connect a source vertex s to a sink vertex t. Each edge 
has a cost that is a function of the amount of traffic that uses the edge. We assume that 
selfish players choose routes to minimize the cost incurred; in an equilibrium outcome, 
all players choose a path of minimum cost. In the equilibrium in Pigou’s example, all 
players choose the second edge, and the cost of this edge in this outcome is 1. 

More generally, a selfish routing game occurs in a multicommodity flow network, 
or simply a network. A network is given by a directed graph G = (V, E), with vertex 
set $V$ and directed edge set $E$, together with a set $(s_1, t_1), \ldots, (s_k, t_k)$ of source-sink
vertex pairs. We also call such pairs commodities. Each player is identified with one
commodity; note that different players can originate from different source vertices and
travel to different sink vertices. We use $\mathcal{P}_i$ to denote the $s_i$-$t_i$ paths of a network. We
consider only networks in which $\mathcal{P}_i \neq \emptyset$ for all $i$, and define $\mathcal{P} = \bigcup_{i=1}^{k} \mathcal{P}_i$. We allow the
graph $G$ to contain parallel edges, and a vertex can participate in multiple source-sink
pairs.

We describe the routes chosen by players using a flow, which is simply a nonnegative
vector indexed by the set $\mathcal{P}$ of source-sink paths. For a flow $f$ and a path $P \in \mathcal{P}_i$,
we interpret $f_P$ as the amount of traffic of commodity $i$ that chooses the path $P$ to
travel from $s_i$ to $t_i$. Traffic is “inelastic,” in that there is a prescribed amount $r_i$ of traffic
identified with each commodity $i$. A flow $f$ is feasible for a vector $r$ if it routes all of
the traffic: for each $i \in \{1, 2, \ldots, k\}$, $\sum_{P \in \mathcal{P}_i} f_P = r_i$. In particular, we do not impose
explicit edge capacities.

Finally, each edge $e$ of a network has a cost function $c_e : \mathbb{R}^+ \to \mathbb{R}^+$. We always
assume that cost functions are nonnegative, continuous, and nondecreasing. All of 
these assumptions are reasonable in applications where cost represents a quantity that 
only increases with the network congestion; delay is one natural example. When we 
study the price of anarchy in Section 18.4, we also explore more severe assumptions 
on the network cost functions. We define a nonatomic selfish routing game, or simply 
a nonatomic instance, by a triple of the form (G, r, c). 

Next we formalize the notion of equilibrium in nonatomic selfish routing games. 
Define the cost of a path $P$ with respect to a flow $f$ as the sum of the costs of
the constituent edges: $c_P(f) = \sum_{e \in P} c_e(f_e)$, where $f_e = \sum_{P \in \mathcal{P} : e \in P} f_P$ denotes the
amount of traffic using paths that contain the edge $e$. Since we expect selfish traffic to
attempt to minimize its cost, we arrive at the following definition. 
Definition 18.1 (Nonatomic equilibrium flow) Let $f$ be a feasible flow for
the nonatomic instance $(G, r, c)$. The flow $f$ is an equilibrium flow if, for every
commodity $i \in \{1, 2, \ldots, k\}$ and every pair $P, \tilde{P} \in \mathcal{P}_i$ of $s_i$-$t_i$ paths with $f_P > 0$,


$$c_P(f) \le c_{\tilde{P}}(f).$$


In other words, all paths in use by an equilibrium flow f have minimum-possible 
cost (given their source, sink, and the congestion caused by $f$). In particular, all paths
of a given commodity used by an equilibrium flow have equal cost. Section 18.3.1 
proves that every nonatomic instance admits at least one equilibrium flow, and that all 
equilibrium flows of a nonatomic instance have equal cost. 

In Pigou’s example, routing all of the traffic on the second link defines an equilibrium 
flow; only one path carries flow, and the only alternative has equal cost. Splitting the 
traffic equally between the two links defines a flow that is not an equilibrium flow; 
the first link carries a strictly positive amount of traffic and its cost is 1, but there is a 
strictly cheaper alternative (the second link, with cost 1/2). 
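
The two claims about Pigou’s example can be checked mechanically against Definition 18.1. The sketch below is a minimal illustration, assuming a flow is stored as a dictionary from paths (tuples of edge names) to traffic amounts; the helper names and the edge labels `"upper"` (first link, cost 1) and `"lower"` (second link, cost $x$) are choices made here, not notation from the text.

```python
def edge_flows(path_flow):
    """Aggregate a path flow {path: amount} into edge flows f_e."""
    f_e = {}
    for path, amount in path_flow.items():
        for edge in path:
            f_e[edge] = f_e.get(edge, 0.0) + amount
    return f_e


def path_cost(path, path_flow, cost):
    """c_P(f): sum of the edge costs along `path` at the induced edge flows."""
    f_e = edge_flows(path_flow)
    return sum(cost[edge](f_e.get(edge, 0.0)) for edge in path)


def is_equilibrium(path_flow, commodities, cost, tol=1e-9):
    """Check Definition 18.1: every path carrying flow is a cheapest path.

    `commodities` is a list with one list of paths (the set P_i) per commodity.
    """
    for paths in commodities:
        cheapest = min(path_cost(p, path_flow, cost) for p in paths)
        for p in paths:
            used = path_flow.get(p, 0.0) > 0
            if used and path_cost(p, path_flow, cost) > cheapest + tol:
                return False
    return True


# Pigou's example: first link has constant cost 1, second link has cost x.
cost = {"upper": lambda x: 1.0, "lower": lambda x: x}
paths = [("upper",), ("lower",)]

all_lower = {("upper",): 0.0, ("lower",): 1.0}
even_split = {("upper",): 0.5, ("lower",): 0.5}

print(is_equilibrium(all_lower, [paths], cost))   # True
print(is_equilibrium(even_split, [paths], cost))  # False
```

In the even split, the first link carries traffic at cost 1 while the second link costs only 1/2, which is exactly the violation described above.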


Remark 18.2 Our description of nonatomic selfish routing games and their 
equilibria does not parallel that of simultaneous-move games in Chapter 1. For 
example, we have not explicitly defined the set of players. While more general 
types of nonatomic games are frequently defined explicitly in terms of player 
sets, strategy profiles, and player payoff functions, selfish routing games possess 
special structure. In particular, the cost incurred by a player depends only on its 
path and the amount of flow on the edges of its path, rather than on the identities 
of any of the players. Games of this type are often called congestion games. 
Because of this structure, it is sufficient and simpler to work directly with flows 
in nonatomic selfish routing games. 


When we quantify the inefficiency of equilibrium flows in Section 18.4, we consider 
only the utilitarian objective of minimizing the total cost incurred by traffic. (Other 
objectives have been studied; see Section 18.6.) Precisely, since the cost incurred by a
player choosing the path $P$ in the flow $f$ is $c_P(f)$, and $f_P$ denotes the amount of traffic
choosing the path $P$, we define the cost of a flow $f$ as


$$C(f) = \sum_{P \in \mathcal{P}} c_P(f) f_P. \qquad (18.1)$$

Expanding $c_P(f)$ as $\sum_{e \in P} c_e(f_e)$ and reversing the order of summation in (18.1) gives
a useful alternative definition of the cost of a flow:


$$C(f) = \sum_{e \in E} c_e(f_e) f_e. \qquad (18.2)$$

For an instance $(G, r, c)$, we call a feasible flow optimal if it minimizes the cost over
all feasible flows.
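
To see that the path-based and edge-based expressions (18.1) and (18.2) agree, it helps to evaluate both on a network in which two paths share an edge. The following sketch does this; the three-edge network, its cost functions, and the function names are hypothetical and chosen only for illustration.

```python
def edge_flows(path_flow):
    """Aggregate a path flow {path: amount} into edge flows f_e."""
    f_e = {}
    for path, amount in path_flow.items():
        for edge in path:
            f_e[edge] = f_e.get(edge, 0.0) + amount
    return f_e


def cost_by_paths(path_flow, cost):
    """Equation (18.1): sum over paths of c_P(f) * f_P."""
    f_e = edge_flows(path_flow)
    return sum(amount * sum(cost[e](f_e[e]) for e in path)
               for path, amount in path_flow.items())


def cost_by_edges(path_flow, cost):
    """Equation (18.2): sum over edges of c_e(f_e) * f_e."""
    f_e = edge_flows(path_flow)
    return sum(cost[e](x) * x for e, x in f_e.items())


# Hypothetical network: both paths use edge "a", then split onto "b" or "c".
cost = {"a": lambda x: x, "b": lambda x: 1.0, "c": lambda x: 2.0 * x}
flow = {("a", "b"): 0.75, ("a", "c"): 0.25}

print(cost_by_paths(flow, cost))  # 1.875
print(cost_by_edges(flow, cost))  # 1.875
```

Here the shared edge "a" carries one unit of traffic and contributes $1 \cdot 1$ to (18.2), while in (18.1) the same contribution is split across the two paths in proportion to their flow.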

As in Chapter 17, the price of anarchy of a nonatomic selfish routing game, with
respect to this objective, is the ratio between the cost of an equilibrium flow and that of
an optimal flow. We can use the cost of an arbitrary equilibrium flow in lieu of that of
a worst equilibrium flow (cf. Chapter 17), since all equilibrium flows of a nonatomic
instance have equal cost (Section 18.3.1). In Pigou’s example, the equilibrium flow
routes all of the traffic on the second link and has cost 1. As we will see in Section 18.3.1,
the optimal flow splits the traffic equally between the two links and has cost 3/4. The
price of anarchy in Pigou’s example is therefore 4/3.


Figure 18.1. A nonlinear variant of Pigou’s example (Example 18.3). The two edges are
labeled $c(x) = 1$ and $c(x) = x^p$.

We conclude this section with two more important examples of nonatomic selfish 
routing networks. 


Example 18.3 (Nonlinear Pigou’s example) The inefficiency of the equilibrium
flow in Pigou’s example can be amplified with a seemingly minor modification
to the network. Suppose that we replace the previously linear cost function
$c(x) = x$ on the lower edge with the highly nonlinear one $c(x) = x^p$ for $p$ large
(Figure 18.1). As in Pigou’s example, the cost of the unique equilibrium flow
is 1. The optimal flow routes a small $\epsilon$ fraction of the traffic on the upper edge
and has cost $\epsilon + (1 - \epsilon)^{p+1}$, where $\epsilon$ tends to 0 as $p$ tends to infinity. Precisely,
Section 18.3.1 shows that $\epsilon = 1 - (p + 1)^{-1/p}$. As $p$ tends to infinity, the cost
of the optimal flow approaches 0 and the price of anarchy grows without bound.
Exercise 18.1 shows that this rate of growth is roughly $p / \ln p$ as $p \to \infty$.
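
The growth of the price of anarchy in Example 18.3 is easy to tabulate from the formulas just stated. The sketch below is only a numerical illustration; the function names are hypothetical, and the comparison column $p / \ln p$ is printed purely as the rough growth rate mentioned above, not as an exact prediction.

```python
import math


def optimal_epsilon(p):
    """Fraction of traffic on the upper edge in the optimal flow."""
    return 1.0 - (p + 1.0) ** (-1.0 / p)


def optimal_cost(p):
    """Cost eps + (1 - eps)^(p + 1) of the optimal flow."""
    eps = optimal_epsilon(p)
    return eps + (1.0 - eps) ** (p + 1)


def price_of_anarchy(p):
    """Equilibrium cost (always 1) divided by the optimal cost."""
    return 1.0 / optimal_cost(p)


print(price_of_anarchy(1))  # Pigou's original example: 4/3, up to rounding

for p in (2, 10, 100, 1000):
    print(p, price_of_anarchy(p), p / math.log(p))
```

For $p = 1$ the formulas recover the optimal cost 3/4 of the linear example; for larger $p$ the ratio increases steadily.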


While the price of anarchy in our final example is no larger than in Pigou’s example, 
it is arguably a more shocking display of the inefficiency of equilibria in selfish routing 
networks. 


Example 18.4 (Braess’s Paradox) Consider the four-node network shown in 
Figure 18.2(a). There are two disjoint routes from s to t, each with combined cost 
$1 + x$, where $x$ is the amount of traffic that uses the route. Assume that there is
one unit of traffic. In the equilibrium flow, the traffic is split evenly between the 
two routes, and all of the traffic experiences 3/2 units of cost. 

Now suppose that, in an effort to decrease the cost encountered by the traffic, 
we build a zero-cost edge connecting the midpoints of the two existing routes. 
The new network is shown in Figure 18.2(b). What is the new equilibrium flow? 

Figure 18.2. Braess’s Paradox. (a) Initial network. (b) Augmented network. The addition of an
intuitively helpful edge can adversely affect all of the traffic.


The previous equilibrium flow does not persist in the new network: the cost of
the new route $s \to v \to w \to t$ is never worse than that along the two original
paths, and it is strictly less whenever some traffic fails to use it. As a consequence,
the unique equilibrium flow routes all of the traffic on the new route. Because of
the ensuing heavy congestion on the edges $(s, v)$ and $(w, t)$, all of the traffic now
experiences two units of cost. Braess’s Paradox thus shows that the intuitively
helpful action of adding a new zero-cost edge can increase the cost experienced
by all of the traffic!
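
The numbers in Example 18.4 can be reproduced with a short computation. The sketch below assumes, consistently with the description above, that the edges $(s, v)$ and $(w, t)$ have cost $x$, that the edges $(v, t)$ and $(s, w)$ have constant cost 1, and that the added edge $(v, w)$ has cost 0; the variable and function names are hypothetical.

```python
def edge_flows(path_flow):
    """Aggregate a path flow {path: amount} into edge flows f_e."""
    f_e = {}
    for path, amount in path_flow.items():
        for edge in path:
            f_e[edge] = f_e.get(edge, 0.0) + amount
    return f_e


def path_cost(path, f_e, cost):
    """Sum of edge costs along `path` at the edge flows f_e."""
    return sum(cost[edge](f_e.get(edge, 0.0)) for edge in path)


cost = {
    ("s", "v"): lambda x: x,
    ("v", "t"): lambda x: 1.0,
    ("s", "w"): lambda x: 1.0,
    ("w", "t"): lambda x: x,
    ("v", "w"): lambda x: 0.0,  # the added zero-cost edge
}
upper = (("s", "v"), ("v", "t"))
lower = (("s", "w"), ("w", "t"))
zigzag = (("s", "v"), ("v", "w"), ("w", "t"))

# Initial network: traffic splits evenly between the two routes.
split = edge_flows({upper: 0.5, lower: 0.5})
cost_before = path_cost(upper, split, cost)
print(cost_before)  # 1.5

# In the augmented network the new route undercuts the old split ...
zigzag_under_split = path_cost(zigzag, split, cost)
print(zigzag_under_split)  # 1.0

# ... and once all traffic uses it, every route costs 2.
crowded = edge_flows({zigzag: 1.0})
costs_after = {name: path_cost(p, crowded, cost)
               for name, p in (("upper", upper), ("lower", lower), ("zigzag", zigzag))}
print(costs_after)  # all three values are 2.0
```

Since all three routes cost 2 when everyone uses the new route, no traffic can improve by switching, while the cost experienced has risen from 3/2 to 2.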


Braess’s Paradox also has remarkable analogues in several physical systems; see 
Section 18.6 for details. 

The optimal flow in the second network of Example 18.4 is the same as the equi- 
librium flow in the first network. The price of anarchy in the second network is 
therefore 4/3, the same as that in Pigou’s example. This is not entirely a coincidence; 
in Section 18.4.1 we prove that no nonatomic instance with cost functions of the form 
ax + b has a price of anarchy larger than 4/3. 

While this chapter does not explicitly study Braess’s Paradox, we obtain bounds on 
the worst-case severity of the paradox as a consequence of our results on the price of 
anarchy (Remark 18.22). 


18.2.2 Atomic Selfish Routing 


An atomic selfish routing game or atomic instance is defined by the same ingredients as a
nonatomic one: a directed graph $G = (V, E)$, $k$ source-sink pairs $(s_1, t_1), \ldots, (s_k, t_k)$,
a positive amount $r_i$ of traffic for each pair $(s_i, t_i)$, and a nonnegative, continuous,
nondecreasing cost function $c_e : \mathbb{R}^+ \to \mathbb{R}^+$ for each edge $e$. We also denote an atomic
instance by a triple (G,r,c). The intuitive difference between a nonatomic and an 
atomic instance is that in the former, each commodity represents a large population of 
individuals, each of whom controls a negligible amount of traffic; in the latter, each 
commodity represents a single player who must route a significant amount of traffic on 
a single path. 

More formally, atomic instances are finite simultaneous-move games in the sense 
of Chapter 1. There are k players, one for each source—sink pair. Different players 
can have identical source-sink pairs. The strategy set of player $i$ is the set $\mathcal{P}_i$ of $s_i$-$t_i$
paths, and if player $i$ chooses the path $P$, then it routes its $r_i$ units of traffic on $P$. A
flow is now a nonnegative vector indexed by players and paths, with $f_P^{(i)}$ denoting the
amount of traffic that player $i$ routes on the $s_i$-$t_i$ path $P$. A flow $f$ is feasible for an
atomic instance if it corresponds to a strategy profile: for each player $i$, $f_P^{(i)}$ equals $r_i$
for exactly one $s_i$-$t_i$ path and equals 0 for all other paths. The cost $c_P(f)$ of a path $P$
with respect to a flow $f$ and the cost $C(f)$ of a flow $f$ are defined as in Section 18.2.1.

An equilibrium flow of an atomic selfish routing game is a feasible flow such that 
no player can strictly decrease its cost by choosing a different path for its traffic. 


Definition 18.5 (Atomic equilibrium flow) Let $f$ be a feasible flow for the
atomic instance $(G, r, c)$. The flow $f$ is an equilibrium flow if, for every player
$i \in \{1, 2, \ldots, k\}$ and every pair $P, \tilde{P} \in \mathcal{P}_i$ of $s_i$-$t_i$ paths with $f_P^{(i)} > 0$,


$$c_P(f) \le c_{\tilde{P}}(\tilde{f}),$$


where $\tilde{f}$ is the flow identical to $f$ except that $\tilde{f}_P^{(i)} = 0$ and $\tilde{f}_{\tilde{P}}^{(i)} = r_i$.


We have defined equilibrium flows to correspond to pure-strategy Nash equilibria (see 
Chapter 1). Flows corresponding to mixed-strategy Nash equilibria have also been 
studied (see Section 18.6), but we will not consider them in this chapter. 

While the definitions of nonatomic and atomic instances are very similar, the two 
models are technically quite different. The next example illustrates two of these differ- 
ences. First, different equilibrium flows of an atomic instance can have different costs; 
as claimed in Section 18.2.1 and proved in Section 18.3.1, all equilibrium flows of a 
nonatomic instance have equal cost. Second, the price of anarchy in atomic instances 
can be larger than in their nonatomic counterparts. The following atomic instance has 
affine cost functions — of the form ax + b — and its price of anarchy is 5/2; in every 
nonatomic instance with affine cost functions, the price of anarchy is at most 4/3 
(Section 18.4.1). We call this the AAE example, after the initials of its discoverers (see 
Section 18.6). 


Example 18.6 (AAE example) Consider the bidirected triangle network shown 
in Figure 18.3. We assume that there are four players, each of whom needs to route 
one unit of traffic. The first two have source u and sinks v and w, respectively; 
the third has source v and sink w; and the fourth has source w and sink v. Each 
player has two strategies, a one-hop path and a two-hop path. In the optimal flow, 
all players route on their one-hop paths, and the cost of this flow is 4. This flow is 
also an equilibrium flow. On the other hand, if all players route on their two-hop 
paths, then we obtain a second equilibrium flow. Since the first two players each 
incur three units of cost and the last two players each incur two units of cost, this 
equilibrium flow has a cost of 10. The price of anarchy of this instance is therefore 
10/4 = 2.5. 
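
Because each of the four players has only two strategies, the claims in Example 18.6 can be verified by enumerating all 16 strategy profiles. The edge costs used below are an assumption, since Figure 18.3 is not reproduced here: the edges $(u, v)$, $(u, w)$, $(v, w)$, and $(w, v)$ are taken to have cost $c(x) = x$ and the edges $(v, u)$ and $(w, u)$ to have cost 0, which matches the costs quoted in the example. All names in the code are hypothetical.

```python
from itertools import product

# Assumed edge costs (see the lead-in); x is the number of players on the edge.
cost = {
    ("u", "v"): lambda x: x, ("u", "w"): lambda x: x,
    ("v", "w"): lambda x: x, ("w", "v"): lambda x: x,
    ("v", "u"): lambda x: 0, ("w", "u"): lambda x: 0,
}

# strategies[i] = (one-hop path, two-hop path) of player i + 1.
strategies = [
    ((("u", "v"),), (("u", "w"), ("w", "v"))),  # source u, sink v
    ((("u", "w"),), (("u", "v"), ("v", "w"))),  # source u, sink w
    ((("v", "w"),), (("v", "u"), ("u", "w"))),  # source v, sink w
    ((("w", "v"),), (("w", "u"), ("u", "v"))),  # source w, sink v
]


def player_costs(profile):
    """Cost incurred by each player when player i uses path profile[i]."""
    load = {}
    for path in profile:
        for edge in path:
            load[edge] = load.get(edge, 0) + 1
    return [sum(cost[edge](load[edge]) for edge in path) for path in profile]


def is_equilibrium(profile):
    """Definition 18.5: no player can strictly decrease its cost by deviating."""
    costs = player_costs(profile)
    for i, options in enumerate(strategies):
        for alt in options:
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if player_costs(deviated)[i] < costs[i]:
                return False
    return True


profiles = list(product(*strategies))
optimal_cost = min(sum(player_costs(p)) for p in profiles)
equilibrium_costs = sorted(sum(player_costs(p)) for p in profiles if is_equilibrium(p))

print(optimal_cost)                           # 4
print(equilibrium_costs)                      # includes 4 and 10
print(max(equilibrium_costs) / optimal_cost)  # 2.5
```

Under the assumed costs, the all-one-hop profile attains the optimal cost 4, and the all-two-hop profile is an equilibrium of cost 10.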


Exercise 18.2 explores variants of the AAE example. 

Next we study the even more basic issue of the existence of equilibrium flows. 
Recall that equilibrium flows for atomic instances correspond to pure-strategy Nash 
equilibria, which do not always exist in arbitrary finite games (see Chapter 1). Do 
they always exist in atomic selfish routing games? Our second example answers this 
question in the negative. 
Figure 18.3. The AAE example (Example 18.6). In atomic instances with affine cost functions,
different equilibrium flows can have different costs, and the price of anarchy can be as large
as 5/2.


Example 18.7 (Nonexistence in weighted atomic instances) Consider the net- 
work shown in Figure 18.4. Extend this network to an atomic selfish routing game 
by adding two players, both with source $s$ and sink $t$, with traffic amounts $r_1 = 1$
and $r_2 = 2$.

We claim that there is no equilibrium flow in this atomic instance. To prove
this, let $P_1$, $P_2$, $P_3$, and $P_4$ denote the paths $s \to t$, $s \to v \to t$, $s \to w \to t$,
and $s \to v \to w \to t$, respectively. The following four statements then imply the
claim.

(1) If player 2 takes path $P_1$ or $P_2$, then the unique response by player 1 that minimizes
its cost is the path $P_4$.

(2) If player 2 takes path $P_3$ or $P_4$, then the unique best response by player 1 is the
path $P_1$.

(3) If player 1 takes the path $P_4$, then the unique best response by player 2 is the
path $P_3$.

(4) If player 1 takes the path $P_1$, then the unique best response by player 2 is the
path $P_2$.

We leave verification of (1)—(4) to the reader. 


On the other hand, Section 18.3.2 proves that every atomic instance in which all 
players route the same amount of traffic admits at least one equilibrium flow. We call 


Figure 18.4. An atomic instance with no equilibrium flow (Example 18.7). 
instances of this type unweighted. Example 18.6 is an unweighted instance, while
Example 18.7 is not. 


18.3 Existence, Uniqueness, and Potential Functions 


This section collects existence and uniqueness results about equilibrium flows in 
nonatomic and atomic selfish routing games. We also introduce the potential func- 
tion method, a fundamental proof technique. 


18.3.1 Nonatomic Selfish Routing: Existence and Uniqueness 


Our next goal is to show that in nonatomic selfish routing games, equilibrium flows 
always exist and are essentially unique. By “essentially unique,” we mean that all
equilibrium flows of a nonatomic instance have the same cost. In particular, the price 
of stability (Section 17.1) and the price of anarchy coincide in every nonatomic instance. 
Formally, our aim is to prove the following theorem. 


Theorem 18.8 (Existence and uniqueness of equilibrium flows) Let $(G, r, c)$
be a nonatomic instance.


(a) The instance $(G, r, c)$ admits at least one equilibrium flow.
(b) If $f$ and $\tilde{f}$ are equilibrium flows for $(G, r, c)$, then $c_e(f_e) = c_e(\tilde{f}_e)$ for every
edge $e$.


Part (b) of the theorem and Definition 18.1 easily imply that two equilibrium flows of 
a nonatomic instance have equal cost. 

We prove Theorem 18.8 with the potential function method. The idea of this method 
is to exhibit a real-valued “potential function,” defined on the outcomes of a game, such 
that the equilibria of the game are precisely the outcomes that optimize the potential 
function. Potential functions are useful because they enable the application of optimiza- 
tion techniques to the study of equilibria. When a game admits a potential function, there 
are typically consequences for the existence, uniqueness, and inefficiency of equilibria. 

To motivate the potential functions corresponding to nonatomic selfish routing
games, we present a characterization of optimal flows in such games. To state this
characterization cleanly, we assume that for every edge $e$ of the given nonatomic instance,
the function $x \cdot c_e(x)$ is continuously differentiable and convex. Note that $x \cdot c_e(x)$ is
the contribution to the social cost function (18.2) by traffic on the edge $e$. Let
$c_e^*(x) = (x \cdot c_e(x))' = c_e(x) + x \cdot c_e'(x)$ denote the marginal cost function for the edge $e$. For
example, if $c(x)$ denotes the cost function $c(x) = a x^p$ for some $a, p > 0$, then the
corresponding marginal cost function is $c^*(x) = (p + 1) a x^p$. Let $c_P^*(f) = \sum_{e \in P} c_e^*(f_e)$
denote the sum of the marginal costs of the edges in the path $P$ with respect to the
flow $f$. The characterization follows.


Proposition 18.9 (Characterization of optimal flows) Let $(G, r, c)$ be a
nonatomic instance such that, for every edge $e$, the function $x \cdot c_e(x)$ is convex
and continuously differentiable. Let $c_e^*$ denote the marginal cost function of the
edge $e$. Then $f^*$ is an optimal flow for $(G, r, c)$ if and only if, for every commodity
$i \in \{1, 2, \ldots, k\}$ and every pair $P, \tilde{P} \in \mathcal{P}_i$ of $s_i$-$t_i$ paths with $f_P^* > 0$,


$$c_P^*(f^*) \le c_{\tilde{P}}^*(f^*).$$


Proposition 18.9 follows immediately from the first-order conditions of a convex opti- 
mization problem with nonnegativity constraints. We omit the details and focus instead 
on how the proposition leads to a potential function for equilibrium flows in nonatomic 
instances, and on the implications of this potential function for the existence and 
uniqueness of equilibrium flows. 

Definition 18.1 and Proposition 18.9 immediately imply that equilibrium flows and 
optimal flows are the same thing, just with respect to different sets of cost functions. 


Corollary 18.10 (Equivalence of equilibrium and optimal flows) Let (G, r, c) 
be a nonatomic instance such that, for every edge $e$, the function $x \cdot c_e(x)$ is convex
and continuously differentiable. Let $c_e^*$ denote the marginal cost function of the
edge e. Then f* is an optimal flow for (G, r, c) if and only if it is an equilibrium 
flow for (G, r, c*). 


For instance, in Pigou’s example (Example 17.1), the marginal cost functions of the 
two edges are c*(x) = 1 and c*(x) = 2x. The equilibrium flow with respect to the 
marginal cost functions splits the traffic equally between the two links, equalizing their 
marginal costs at 1; by Corollary 18.10, this flow is optimal in the original network. In 
the nonlinear variant of Pigou’s example (Example 18.3), the marginal cost functions
are $c^*(x) = 1$ and $c^*(x) = (p + 1) x^p$; the optimal flow therefore routes $(p + 1)^{-1/p}$
units of traffic on the second link and the rest on the first. In Braess’s Paradox with the
zero-cost edge added (Example 18.4 and Figure 18.2(b)), routing half of the traffic on
each of the paths $s \to v \to t$ and $s \to w \to t$ equalizes the marginal costs of all three
paths at 2, and therefore provides an optimal flow.
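
The marginal-cost characterization can be sanity-checked numerically on the nonlinear variant of Pigou’s example. The sketch below minimizes the total cost directly with a ternary search (a generic method for convex functions on an interval, used here as a stand-in and not taken from the text) and compares the minimizer with the closed form $(p + 1)^{-1/p}$. The function names are hypothetical.

```python
def minimize_convex(g, lo, hi, iters=200):
    """Ternary search for a minimizer of a convex function g on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) <= g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


def total_cost(x, p):
    """Cost of routing x units on the second link (cost x^p) and 1 - x on the first (cost 1)."""
    return (1 - x) * 1.0 + x * x ** p


results = []
for p in (1, 2, 5):
    x_star = minimize_convex(lambda x: total_cost(x, p), 0.0, 1.0)
    predicted = (p + 1) ** (-1.0 / p)  # flow at which (p + 1) x^p equals 1
    marginal = (p + 1) * x_star ** p   # marginal cost of the second link at x_star
    results.append((p, x_star, predicted, marginal))
    print(p, x_star, predicted, marginal)
```

In each case the numerically optimal flow on the second link agrees with the closed form up to numerical error, and the marginal cost of the second link at that flow is approximately 1, the marginal cost of the first link.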

To construct a potential function for equilibrium flows, we need to “invert”
Corollary 18.10: of what function do equilibrium flows arise as the global minima? The
answer is simple: to recover Definition 18.1 as an optimality condition, we seek a
function $h_e(x)$ for each edge $e$, playing the previous role of $x \cdot c_e(x)$, such that
$h_e'(x) = c_e(x)$. Setting $h_e(x) = \int_0^x c_e(y)\,dy$ for each edge $e$ thus yields the desired
potential function. Moreover, since $c_e$ is continuous and nondecreasing for every edge $e$,
every function $h_e$ is both continuously differentiable and convex.

Precisely, call


$$\Phi(f) = \sum_{e \in E} \int_0^{f_e} c_e(x)\,dx \qquad (18.3)$$


the potential function of a nonatomic instance $(G, r, c)$. Invoking Proposition 18.9,
with each function $x \cdot c_e(x)$ replaced by $h_e(x) = \int_0^x c_e(y)\,dy$, yields the same condition
as in Definition 18.1; we have therefore characterized equilibrium flows as the global
minimizers of the potential function $\Phi$.
Proposition 18.11 (Potential function for equilibrium flows) Let $(G, r, c)$ be
a nonatomic instance. A flow feasible for $(G, r, c)$ is an equilibrium flow if and
only if it is a global minimum of the corresponding potential function $\Phi$ given
in (18.3).
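
As a small illustration of Proposition 18.11, the potential function (18.3) can be minimized by brute force in two-route networks, where a feasible flow is described by a single number $x \in [0, 1]$. The sketch below approximates each integral with the midpoint rule and searches a uniform grid; both choices, like the function and edge names, are simplifying assumptions made for this example only.

```python
def integral(c, upper, steps=100):
    """Midpoint-rule approximation of the integral of c over [0, upper]."""
    h = upper / steps
    return sum(c((k + 0.5) * h) for k in range(steps)) * h


def potential(edge_flow, cost):
    """Phi(f) as in (18.3): sum over edges of the integral of c_e from 0 to f_e."""
    return sum(integral(cost[e], x) for e, x in edge_flow.items())


def best_split(edge_flow_of, cost, grid=100):
    """Grid search over x in [0, 1] for the split minimizing the potential."""
    return min((i / grid for i in range(grid + 1)),
               key=lambda x: potential(edge_flow_of(x), cost))


# Pigou's example: x units on the second link (cost y), 1 - x on the first (cost 1).
pigou_cost = {"first": lambda y: 1.0, "second": lambda y: y}
pigou_x = best_split(lambda x: {"first": 1 - x, "second": x}, pigou_cost)
print(pigou_x)  # 1.0: all traffic on the second link

# Initial network of Example 18.4: each route has one edge of cost y and one of cost 1.
route_cost = {"A_var": lambda y: y, "A_const": lambda y: 1.0,
              "B_var": lambda y: y, "B_const": lambda y: 1.0}
two_route_x = best_split(
    lambda x: {"A_var": x, "A_const": x, "B_var": 1 - x, "B_const": 1 - x},
    route_cost)
print(two_route_x)  # 0.5: traffic splits evenly
```

In both networks the minimizer of $\Phi$ coincides with the equilibrium flow described earlier, even though in Pigou’s example that flow does not minimize the total cost (18.2).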


Theorem 18.8 now follows from Proposition 18.11 and routine calculus. 


PROOF OF THEOREM 18.8 We first note that, by definition, the set of feasible
flows of (G, r, c) can be identified with a compact (i.e., closed and bounded) subset of |P|-dimensional Euclidean space. Since edge cost functions are continuous,
the potential function is a continuous function on this set. By Weierstrass’s Theorem from elementary mathematical analysis, the potential function $\Phi$ achieves a
minimum value on this set. By Proposition 18.11, every point at which $\Phi$ attains
its minimum corresponds to an equilibrium flow of (G, r, c).

For part (b), recall that each cost function is nondecreasing, and hence each
summand on the right-hand side of (18.3) is convex. Hence, the potential function $\Phi$ is a convex function.

Now suppose that $f$ and $\tilde f$ are equilibrium flows for (G, r, c). By Proposition 18.11, both $f$ and $\tilde f$ minimize the potential function $\Phi$. We consider all
convex combinations of $f$ and $\tilde f$, that is, all vectors of the form $\lambda f + (1 - \lambda)\tilde f$
for $\lambda \in [0, 1]$. All of these vectors are feasible flows. Since $\Phi$ is a convex function,
a chord between two points on its graph cannot pass below its graph. In algebraic
terms, we have

\[
\Phi(\lambda f + (1 - \lambda)\tilde f) \le \lambda \Phi(f) + (1 - \lambda)\Phi(\tilde f) \tag{18.4}
\]

for every $\lambda \in [0, 1]$. Since both $f$ and $\tilde f$ are global minima of $\Phi$, the inequality (18.4) must hold with equality for all of their convex combinations. Since
every summand of $\Phi$ is convex, this can occur only if every summand $\int_0^x c_e(y)\,dy$
is linear between the values $f_e$ and $\tilde f_e$. In turn, this implies that every cost function
$c_e$ is constant between $f_e$ and $\tilde f_e$.


18.3.2 Atomic Selfish Routing: Existence 


We now consider equilibrium flows in atomic instances. The AAE example (Exam- 
ple 18.6) suggests that no interesting uniqueness results are possible in such instances, 
so we focus instead on the existence of equilibrium flows. Similarly, Example 18.7 
demonstrates that a general atomic instance need not admit an equilibrium flow. There 
are two approaches to circumventing this counterexample. The first, taken in this 
section, is to place additional restrictions on atomic instances so that equilibrium 
flows are guaranteed to exist. The second approach, discussed in Remark 18.26, 
is to relax the equilibrium concept so that an equilibrium exists in every atomic 
instance. 

The key result in this section is the following theorem, which establishes the exis- 
tence of equilibrium flows in atomic instances in which all players control the same 
amount of traffic. 




Theorem 18.12 (Equilibrium flows in unweighted atomic instances) Let
(G, r, c) be an atomic instance in which every traffic amount $r_i$ is equal to a
common positive value R. Then (G, r, c) admits at least one equilibrium flow.


PROOF We obtain Theorem 18.12 by discretizing the potential function (18.3)
for nonatomic instances and the proof of Theorem 18.8(a). Assume for simplicity
that R = 1. Set

\[
\Phi_a(f) = \sum_{e \in E} \sum_{i=1}^{f_e} c_e(i) \tag{18.5}
\]

for every feasible flow f. Note that $\Phi_a$ is the same as the previous potential
function $\Phi$ for nonatomic instances, except that the integral $\int_0^{f_e} c_e(x)\,dx$ has been
replaced by the sum $\sum_{i=1}^{f_e} c_e(i)$.

Since the atomic instance (G, r, c) has a finite number of players, and each of
these has a finite number of strategies, there are only a finite number of possible
flows. One of these, call it f, is a global minimum of the potential function $\Phi_a$.
We claim that f is an equilibrium flow for (G, r, c). To prove it, assume for
contradiction that in f, the player i could strictly decrease its cost by deviating
from the path $P$ to the path $\tilde P$, yielding the new flow $\tilde f$. In other words, we assume
that

\[
0 > c_{\tilde P}(\tilde f) - c_P(f) = \sum_{e \in \tilde P \setminus P} c_e(f_e + 1) - \sum_{e \in P \setminus \tilde P} c_e(f_e). \tag{18.6}
\]

On the other hand, consider the impact of player i’s deviation on the potential
function $\Phi_a$: for edges in $\tilde P \setminus P$, the corresponding sum in (18.5) acquires the
extra term $c_e(f_e + 1)$; for edges in $P \setminus \tilde P$, the corresponding sum sheds the term
$c_e(f_e)$; and for edges of $P \cap \tilde P$, the corresponding sum remains the same. Thus,
$\Phi_a(\tilde f) - \Phi_a(f)$ is precisely the third expression of (18.6). Since this expression
is negative, the potential function value of $\tilde f$ is strictly less than that of f, which
contradicts our choice of f.
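
The argument can be checked by brute force on a small instance. The following is a minimal sketch under stated assumptions: the instance (three unit-size players choosing among three parallel links, with the cost functions listed in `COSTS`) is hypothetical, as are all function names. It enumerates every flow, picks a minimizer of the discrete potential (18.5), and tests both that it is an equilibrium and that every unilateral deviation changes the potential by exactly the deviator's change in cost.

```python
from itertools import product

# Hypothetical unweighted atomic instance: 3 players, 3 parallel s-t links,
# so every path is a single edge. COSTS[e](n) is the cost of link e at load n.
COSTS = [lambda n: n, lambda n: 2, lambda n: n * n]
PLAYERS = 3

def loads(profile):
    return [profile.count(e) for e in range(len(COSTS))]

def player_cost(profile, i):
    return COSTS[profile[i]](loads(profile)[profile[i]])

def rosenthal(profile):
    """Discrete potential (18.5): sum over edges of c_e(1) + ... + c_e(f_e)."""
    return sum(COSTS[e](j)
               for e, n in enumerate(loads(profile))
               for j in range(1, n + 1))

def deviations(profile, i):
    return [profile[:i] + (e,) + profile[i + 1:] for e in range(len(COSTS))]

def is_equilibrium(profile):
    return all(player_cost(dev, i) >= player_cost(profile, i)
               for i in range(PLAYERS) for dev in deviations(profile, i))

profiles = list(product(range(len(COSTS)), repeat=PLAYERS))
best = min(profiles, key=rosenthal)   # a global minimum of the potential

# The potential changes by exactly the deviating player's change in cost.
identity_holds = all(
    rosenthal(dev) - rosenthal(p) == player_cost(dev, i) - player_cost(p, i)
    for p in profiles for i in range(PLAYERS) for dev in deviations(p, i))

print(best, rosenthal(best), is_equilibrium(best), identity_holds)
```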


Remark 18.13 The proof of Theorem 18.12 establishes a remarkable property 
of the potential function $\Phi_a$: it “tracks” the change in cost experienced by a
deviating player. More formally, for every flow, every player, and every deviation 
by a player, the change in the player’s cost is identical to the change in the 
potential function. This property has consequences beyond the existence result 
of Theorem 18.12. For example, it implies that “best-response dynamics” are 
guaranteed to converge to an equilibrium flow. See Chapter 19 for further details. 
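
One way to picture the convergence claim is the sketch below, which runs best-response dynamics on a small instance and records the potential after every move. The instance (four unit-size players, three parallel links with the costs in `COSTS`) and the round-robin update order are assumptions chosen for the illustration, not part of the text.

```python
# Hypothetical instance: 4 unit-size players, 3 parallel links.
COSTS = [lambda n: n, lambda n: 2 * n, lambda n: n + 3]

def loads(profile):
    return [profile.count(e) for e in range(len(COSTS))]

def cost_if(profile, i, e):
    """Cost player i would pay on link e, with the other players fixed."""
    n = loads(profile)[e] + (0 if profile[i] == e else 1)
    return COSTS[e](n)

def potential(profile):
    return sum(COSTS[e](j)
               for e, n in enumerate(loads(profile))
               for j in range(1, n + 1))

def best_response_dynamics(start):
    profile = list(start)
    history = [potential(profile)]
    improved = True
    while improved:
        improved = False
        for i in range(len(profile)):
            best = min(range(len(COSTS)), key=lambda e: cost_if(profile, i, e))
            # Move only on a strict improvement, so the potential strictly drops.
            if cost_if(profile, i, best) < cost_if(profile, i, profile[i]):
                profile[i] = best
                history.append(potential(profile))
                improved = True
    return profile, history

final, history = best_response_dynamics([2, 2, 2, 2])
print(final, history)
```

Because the potential takes finitely many values and strictly decreases with every improving move, the loop must terminate, and it can only stop at a flow from which no player wants to deviate.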


Remark 18.14 The proof of Theorem 18.12 did not use any assumptions about 
the edge cost functions. In particular, it is also valid when cost functions are 
not nondecreasing. This property will be crucial for some of the network design 
games studied in Chapter 19, which can be viewed as atomic selfish routing games 
with decreasing cost functions. 


The next theorem guarantees the existence of equilibrium flows under a different
restriction: affine cost functions. (Recall that a cost function $c_e(x)$ is affine if it has the
form $a_e x + b_e$; we always assume that $a_e, b_e \ge 0$.)


Theorem 18.15 (Equilibrium flows with affine cost functions) Let (G,r,c) 
be an atomic instance with affine cost functions. Then (G, r, c) admits at least one 
equilibrium flow. 


The proof of Theorem 18.15 follows the same outline as that of Theorem 18.12, and 
uses a variant of the potential function method. See Exercise 18.4 for further details. 


18.4 The Price of Anarchy of Selfish Routing 


18.4.1 Nonatomic Selfish Routing: The Price of Anarchy 


This section gives an essentially complete analysis of the price of anarchy in nonatomic 
selfish routing games. As we know from the nonlinear variant of Pigou’s example 
(Example 18.3), the price of anarchy depends on “nonlinearity” of the network cost 
functions. Our goal is to show that it depends on nothing else — not the network size, the 
network structure, nor the number of commodities. More precisely, we show that for 
every conceivable restriction on the cost functions of a network, the price of anarchy is 
maximized (over all multicommodity networks) by the network that best “simulates” 
Pigou’s example and its nonlinear variants. 

As an aside, we note that the potential function characterization of nonatomic 
equilibrium flows (Proposition 18.11) already gives a good, but not optimal, upper 
bound on the price of anarchy. The intuitive explanation is simple: if equilibrium 
flows exactly optimize a potential function (18.3) that is a good approximation of the 
objective function (18.2), then they should also be approximately optimal. 


Theorem 18.16 (Potential function upper bound) Let (G, r, c) be a nonatomic
instance, and suppose that $x \cdot c_e(x) \le \gamma \cdot \int_0^x c_e(y)\,dy$ for all $e \in E$ and $x \ge 0$.
Then the price of anarchy of (G, r, c) is at most $\gamma$.


PROOF Let f and f* be equilibrium and optimal flows for (G, r, c), respectively.
Since cost functions are nondecreasing, the cost of a flow (18.2) is always at least
its potential function value (18.3). The hypothesis ensures that the cost of a flow
is at most $\gamma$ times its potential function value. The theorem follows by writing

\[
C(f) \le \gamma \cdot \Phi(f) \le \gamma \cdot \Phi(f^*) \le \gamma \cdot C(f^*),
\]

with the second inequality following from Proposition 18.11.


Theorem 18.16 implies that the price of anarchy of selfish routing is large only
in networks with “highly nonlinear” cost functions. For example, if $c_e$ is a polynomial function with degree at most p and nonnegative coefficients, then $x \cdot c_e(x) \le
(p + 1) \int_0^x c_e(y)\,dy$ for all $x \ge 0$. Theorem 18.16 then shows that the price of anarchy
in nonatomic instances with such cost functions is at most linear in p.


Corollary 18.17 (Potential function bound for polynomials) If (G, r, c) is a
nonatomic instance with cost functions that are polynomials with nonnegative coefficients and degree at most p, then the price of anarchy of (G, r, c) is at most
p + 1.
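
The polynomial inequality behind this corollary can be verified term by term. The derivation below is a short sketch; the coefficient names $a_i$ are introduced here for illustration and are not notation from the text.

```latex
% Assumption: c_e(y) = \sum_{i=0}^{p} a_i y^i with every a_i \ge 0, and x \ge 0.
\begin{align*}
\int_0^x c_e(y)\,dy
  &= \sum_{i=0}^{p} \frac{a_i x^{i+1}}{i+1} \\
  &\ge \frac{1}{p+1} \sum_{i=0}^{p} a_i x^{i+1}
     && \text{since } i + 1 \le p + 1 \text{ and } a_i x^{i+1} \ge 0 \\
  &= \frac{x \cdot c_e(x)}{p+1}.
\end{align*}
```

Rearranging gives the hypothesis of Theorem 18.16 with $\gamma = p + 1$.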


This upper bound is nearly matched by Example 18.3, although the upper and lower 
bounds differ by roughly a ln p multiplicative factor (Exercise 18.1). We close this
gap using a different and important proof technique, which is driven by variational 
inequalities. 

We first formalize a natural lower bound on the price of anarchy based on “Pigou-like 
examples.” 


Definition 18.18 (Pigou bound) Let C be a nonempty set of cost functions. The
Pigou bound $\alpha(C)$ for C is

\[
\alpha(C) = \sup_{c \in C} \; \sup_{x, r \ge 0} \frac{r \cdot c(r)}{x \cdot c(x) + (r - x)\,c(r)}, \tag{18.7}
\]

with the understanding that 0/0 = 1.


The point of the Pigou bound is that it lower bounds the price of anarchy in instances 
with cost functions in C. 


Proposition 18.19 Let C be a set of cost functions that contains all of the
constant cost functions. Then the price of anarchy in nonatomic instances with
cost functions in C can be arbitrarily close to $\alpha(C)$.


PROOF Fix a choice of $c \in C$ and $x, r \ge 0$. We can complete the proof by
exhibiting a selfish routing network with cost functions in C and price of anarchy
at least $c(r)r/[c(x)x + (r - x)c(r)]$. Since c is nondecreasing, this expression is
at most 1 if $x \ge r$; we can therefore assume that $x < r$.

Let G be a two-vertex, two-edge network as in Figure 18.1. Give the lower
edge the cost function $c_1(y) = c(y)$ and the upper edge the constant cost function
$c_2(y) = c(r)$. By assumption, both of these cost functions lie in C. Set the traffic
rate to r. Routing all of the traffic on the lower edge yields an equilibrium flow
with cost $c(r)r$. Routing x units of traffic on the lower edge and $r - x$ units of
traffic on the upper edge gives a feasible flow with cost $c(x)x + (r - x)c(r)$.
The price of anarchy in this instance is thus at least $c(r)r/[c(x)x + (r - x)c(r)]$,
as desired.


While Proposition 18.19 assumes that the set C includes all of the constant cost func- 
tions, its conclusion holds whenever C is inhomogeneous in the sense that c(0) > 0 for 
some c ∈ C (Exercise 18.5).

We next show that, even though the Pigou bound is based only on Pigou-like 
examples, it is also an upper bound on the price of anarchy in general multicommodity 
flow networks. The proof requires the following variational inequality characterization 
of equilibrium flows. 


Proposition 18.20 (Variational inequality characterization) Let f be a feasible flow for the nonatomic instance (G, r, c). The flow f is an equilibrium flow
if and only if

\[
\sum_{e \in E} c_e(f_e) f_e \le \sum_{e \in E} c_e(f_e) f^*_e
\]

for every flow f* feasible for (G, r, c).


PROOF Fix f and define the function $H_f$ on the set of feasible flows by

\[
H_f(f^*) = \sum_{i=1}^{k} \sum_{P \in \mathcal{P}_i} c_P(f) f^*_P = \sum_{e \in E} c_e(f_e) f^*_e;
\]

the same reversal of sums used to prove the equivalence of (18.1) and (18.2)
shows that these two definitions of $H_f(f^*)$ agree. The value $H_f(f^*)$ denotes the
cost of a flow f* after the cost function of each edge e has been changed to the
constant function everywhere equal to $c_e(f_e)$. By the second definition of $H_f$, the
proposition is equivalent to the assertion that a flow f is an equilibrium flow if
and only if it minimizes $H_f(\cdot)$ over all feasible flows.

Examining the first definition of $H_f$ shows that a flow f* minimizes $H_f$ if
and only if, for every commodity i, $f^*_P > 0$ only for paths P that minimize $c_P(f)$
over all $s_i$-$t_i$ paths. Since the flow f satisfies this condition if and only if it is an
equilibrium flow, the proof is complete.
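
A small numerical check may make the characterization concrete. The sketch below assumes a hypothetical two-link instance with one unit of traffic (cost x on link 0, constant cost 2 on link 1) and samples random feasible flows; the helper `H` mirrors the function $H_f$ from the proof, and the sampling scheme is an illustration choice.

```python
import random

# Hypothetical instance: two parallel links, one unit of traffic.
# Link 0 has cost c(x) = x, link 1 has constant cost 2.
COSTS = [lambda x: x, lambda x: 2.0]

def H(f, g):
    """H_f(g): cost of flow g when every edge cost is frozen at c_e(f_e)."""
    return sum(c(fe) * ge for c, fe, ge in zip(COSTS, f, g))

random.seed(0)
samples = [[x, 1.0 - x] for x in (random.random() for _ in range(1000))]

f_eq = [1.0, 0.0]    # equilibrium: link 0 costs 1, the unused link 1 costs 2
eq_ok = all(H(f_eq, f_eq) <= H(f_eq, g) + 1e-12 for g in samples)

f_bad = [0.5, 0.5]   # not an equilibrium: link 1 is used although it costs more
bad_ok = all(H(f_bad, f_bad) <= H(f_bad, g) + 1e-12 for g in samples)

print(eq_ok, bad_ok)
```

The inequality holds against every sampled flow for the equilibrium, and fails for the non-equilibrium flow as soon as a sample shifts traffic toward the cheaper link.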


We now show that the Pigou bound is tight. 


Theorem 18.21 (Tightness of the Pigou bound) Let C be a set of cost functions and $\alpha(C)$ the Pigou bound for C. If (G, r, c) is a nonatomic instance with
cost functions in C, then the price of anarchy of (G, r, c) is at most $\alpha(C)$.


PROOF Let f* and f be optimal and equilibrium flows, respectively, for a
nonatomic instance (G, r, c) with cost functions in the set C. The theorem follows
by writing

\begin{align*}
C(f^*) &= \sum_{e \in E} c_e(f^*_e) f^*_e \\
&\ge \frac{1}{\alpha(C)} \sum_{e \in E} c_e(f_e) f_e + \sum_{e \in E} (f^*_e - f_e)\, c_e(f_e) \\
&\ge \frac{C(f)}{\alpha(C)},
\end{align*}

where the first inequality follows from Definition 18.18, applied to each edge
e with $x = f^*_e$ and $r = f_e$, and the second inequality follows from Proposition 18.20.


Proposition 18.19 and Theorem 18.21 show that, for essentially every fixed restriction on the allowable cost functions, the price of anarchy is maximized by Pigou-like
examples. Determining the largest-possible price of anarchy in Pigou-like examples
(i.e., the Pigou bound) is a tractable problem in many cases. For example, it is precisely 4/3 when C is the set of affine cost functions (Exercise 18.6), and more generally
is $[1 - p \cdot (p+1)^{-(p+1)/p}]^{-1} \approx p/\ln p$ when C is the set of polynomials with degree
at most p and nonnegative coefficients. In these cases, the maximum price of anarchy
(among all multicommodity instances) is achieved by the instances in Examples 17.1
and 18.3. The Pigou bound is also known for several other classes of cost functions;
see Section 18.6 for references.
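
The closed form can be compared against a direct numerical evaluation of the ratio in (18.7). The sketch below does this only for the single monomial c(y) = y^p, which is an assumption about where the supremum over the polynomial class is attained (consistent with the statement that Example 18.3 achieves the maximum); the function names and grid resolution are illustrative.

```python
def pigou_ratio(p: int, x: float, r: float = 1.0) -> float:
    """The ratio inside (18.7) for the monomial c(y) = y**p."""
    return (r * r**p) / (x * x**p + (r - x) * r**p)

def pigou_bound_closed_form(p: int) -> float:
    return 1.0 / (1.0 - p * (p + 1) ** (-(p + 1) / p))

def pigou_bound_numeric(p: int, steps: int = 100_000) -> float:
    # For a monomial the ratio depends only on x / r, so r = 1 is fixed here
    # and x ranges over a grid on [0, 1].
    return max(pigou_ratio(p, i / steps) for i in range(steps + 1))

for p in (1, 2, 3, 4):
    print(p, pigou_bound_numeric(p), pigou_bound_closed_form(p))
```

For p = 1 both computations give 4/3, matching the affine case.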


Remark 18.22 (Bounds on Braess’s Paradox) Braess’s Paradox (Example 
18.4) shows that adding edges to a network can increase the cost of its equi- 
librium flow. Since the equilibrium flow in the original network is a candidate for 
the optimal flow in the second network, the ratio between the costs of the new and 
original equilibrium flows is a lower bound on the price of anarchy in the latter 
network. 

On the other hand, Theorem 18.21 and Exercise 18.6 show that the price of 
anarchy is at most 4/3 in every network with affine cost functions. Thus, adding 
edges to a network with affine cost functions cannot increase the cost of its 
equilibrium flow by more than a 4/3 factor. Example 18.4 is therefore a worst- 
case manifestation of Braess’s Paradox in networks with affine cost functions. 
Similar bounds also apply to the physical analogues of Braess’s Paradox that are 
described in Section 18.6. 
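
The 4/3 ratio can be reproduced with a few lines of arithmetic. The sketch below assumes the standard Braess network with one unit of traffic (edge costs x on s→v and w→t, constant 1 on v→t and s→w, and 0 on the added edge v→w); this setup and the helper names are assumptions for the illustration.

```python
# Assumed setup: the standard Braess network with one unit of traffic.
# Edge costs: s->v: x, v->t: 1, s->w: 1, w->t: x, and the added v->w: 0.

def path_costs(a: float, b: float, c: float):
    """Costs of paths s-v-t, s-w-t, s-v-w-t when they carry flows a, b, c."""
    sv, wt = a + c, b + c   # loads on the two congestion-dependent edges
    return sv + 1.0, 1.0 + wt, sv + 0.0 + wt

def total_cost(a: float, b: float, c: float) -> float:
    return sum(f * cp for f, cp in zip((a, b, c), path_costs(a, b, c)))

before = total_cost(0.5, 0.5, 0.0)   # equilibrium without the zero-cost edge
after = total_cost(0.0, 0.0, 1.0)    # equilibrium once the edge is added

print(before, after, after / before)
```

In the second flow all three paths have the same cost, so no traffic can improve by switching, yet the total cost has risen from 3/2 to 2.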


18.4.2 Atomic Selfish Routing: The Price of Anarchy 


We now consider atomic selfish routing games. We again obtain tight bounds on the 
price of anarchy, at least for polynomial cost functions, but the discrete nature of atomic 
instances complicates the analysis. 

We first note that the potential function method, which gave nontrivial bounds on the 
price of anarchy for nonatomic instances (Theorem 18.16), cannot be used for atomic 
instances. The difficulty stems from the non-uniqueness of equilibrium flows in atomic 
instances (Example 18.6). Recall that a bound on the price of anarchy is a guarantee 
that all equilibrium flows of an instance are nearly optimal. Reviewing the proof of
Theorem 18.16, we observe that the potential function method argues about only one 
equilibrium flow — the one with minimum potential function value. As a result, the 
potential function method is directly useful only for bounding the price of stability 
rather than the price of anarchy. While these two quantities coincide in nonatomic 
selfish routing games, they are generally different in atomic ones. (See Section 18.6 
for results on the price of stability in atomic selfish routing games.) 

We instead rely on proof techniques that are partially inspired by the variational 
inequality of Proposition 18.20. This inequality expresses the fact that equilibrium 
flows route all traffic on shortest paths, with respect to the induced edge costs. We derive 
a similar, if more complicated, condition for atomic instances. To keep the proofs as 
transparent as possible, we focus on atomic instances with affine cost functions. Recall 
from Theorem 18.15 that every such instance admits at least one equilibrium flow. The 
analysis can also be extended to other cost functions and other equilibrium concepts;
see Remark 18.26 and Section 18.6 for more details. 
Our goal is the following theorem. 


Theorem 18.23 (The price of anarchy in affine atomic instances) If (G, r, c)
is an atomic instance with affine cost functions, then the price of anarchy of
(G, r, c) is at most $(3 + \sqrt{5})/2 \approx 2.618$.


A variant of the AAE example (Example 18.6) shows that the upper bound in Theo- 
rem 18.23 is the best possible if different players can control different amounts of flow 
(Exercise 18.2(a)). If all of the players control the same amount of flow, then a variant 
of the following proof gives an improved upper bound of 5/2, which matches the lower 
bound furnished by the AAE example (Exercise 18.7). 

We build up to Theorem 18.23 in a sequence of steps. We begin with a lemma that 
follows immediately from the definition of an equilibrium flow. 


Lemma 18.24 (Equilibrium condition) Let (G, r, c) be an atomic instance in
which each edge e has an affine cost function $c_e(x) = a_e x + b_e$ with $a_e, b_e \ge 0$.
Let f and f* be equilibrium and optimal flows, respectively, for (G, r, c). Let
player i use the path $P_i$ in f and the path $P_i^*$ in f*. Then

\[
\sum_{e \in P_i} [a_e f_e + b_e] \le \sum_{e \in P_i^*} [a_e (f_e + r_i) + b_e]. \tag{18.8}
\]


Our second step is to combine the inequalities of Lemma 18.24 — one per player — 
to relate the cost of an arbitrary equilibrium flow to that of an optimal flow. 


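Lemma 18.25 (Equilibrium inequality) With the same assumptions and notation as in Lemma 18.24,

\[
C(f) \le C(f^*) + \sum_{e \in E} a_e f_e f^*_e. \tag{18.9}
\]


PROOF For each player i, multiply the inequality (18.8) by $r_i$. Summing up the
resulting k inequalities, we obtain

\begin{align*}
C(f) &\le \sum_{i=1}^{k} r_i \sum_{e \in P_i^*} [a_e (f_e + r_i) + b_e] \\
&\le \sum_{i=1}^{k} r_i \sum_{e \in P_i^*} [a_e (f_e + f^*_e) + b_e] \\
&= \sum_{e \in E} [a_e (f_e + f^*_e) + b_e]\, f^*_e,
\end{align*}

where the second inequality holds because $r_i \le f^*_e$ for every edge e of $P_i^*$, and
the equality follows by reversing the order of summation. Since the final
expression equals the right-hand side of (18.9), the proof is complete.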

To complete the proof of Theorem 18.23, we upper bound the magnitude of the
“error term” in (18.9) relative to the costs of the equilibrium and optimal flows.


PROOF OF THEOREM 18.23 Let f and f* denote equilibrium and optimal
flows, respectively, for the atomic instance (G, r, c). Assume that edge e has
the cost function $c_e(x) = a_e x + b_e$ for $a_e, b_e \ge 0$. Apply the Cauchy-Schwarz
Inequality to the vectors $\{\sqrt{a_e} f_e\}_{e \in E}$ and $\{\sqrt{a_e} f^*_e\}_{e \in E}$ to obtain

\[
\sum_{e \in E} a_e f_e f^*_e \le \sqrt{\sum_{e \in E} a_e f_e^2} \cdot \sqrt{\sum_{e \in E} a_e (f^*_e)^2} \le \sqrt{C(f)} \cdot \sqrt{C(f^*)}.
\]

Combining this with the Equilibrium Inequality (18.9), dividing through
by C(f*), and rearranging gives

\[
\frac{C(f)}{C(f^*)} - 1 \le \sqrt{\frac{C(f)}{C(f^*)}}.
\]

Squaring both sides and solving the corresponding quadratic inequality $x^2 - 3x + 1 \le 0$, we find that

\[
\frac{C(f)}{C(f^*)} \le \frac{3 + \sqrt{5}}{2} \approx 2.618,
\]

as claimed.
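
For readers who want the final algebra spelled out, the steps can be sketched as follows. The symbol $\rho$ is shorthand introduced here for the ratio $C(f)/C(f^*)$; it does not appear in the text.

```latex
% Shorthand (assumption for this sketch): \rho = C(f) / C(f^*).
\begin{align*}
\rho - 1 &\le \sqrt{\rho}
  && \text{(the rearranged inequality; nothing to prove if } \rho \le 1\text{)} \\
\rho^2 - 2\rho + 1 &\le \rho
  && \text{(squaring, valid since both sides are nonnegative when } \rho \ge 1\text{)} \\
\rho^2 - 3\rho + 1 &\le 0 \\
\rho &\le \frac{3 + \sqrt{5}}{2}
  && \text{(the larger root of } x^2 - 3x + 1 = 0\text{)}.
\end{align*}
```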


Theorem 18.23 can be extended to atomic instances with cost functions that are 
polynomials with nonnegative coefficients and degree at most a parameter p. However, 
the upper bound on the price of anarchy increases with p roughly in proportion to the 
exponential function $p^p$, much faster than in nonatomic instances. This exponential
dependence is not an artifact of the above proof approach, as nearly matching lower 
bounds on the price of anarchy are known (Section 18.6). 


Remark 18.26 Strictly speaking, the price of anarchy is not always defined in
general atomic instances, where equilibrium flows need not exist (Example 18.7).
Nevertheless, Theorem 18.23 has been extended to atomic instances with polynomial cost functions in three different ways. First, when such an instance does
admit at least one equilibrium flow, then all such flows have cost at most $p^{O(p)}$
times that of an optimal flow. Second, by Nash’s Theorem (Chapters 1 and 2),
every such instance admits a mixed-strategy Nash equilibrium, and the expected
cost of every such equilibrium is at most $p^{O(p)}$ times that of an optimal flow. Finally, similar upper bounds have been proved for “sink equilibria,” an equilibrium
concept that always exists in finite games and is motivated by convergence issues.


18.5 Reducing the Price of Anarchy


As we have seen, the price of anarchy can be large in both nonatomic and atomic selfish 
routing games when cost functions are highly nonlinear. This motivates a question first 
posed in Section 17.3: how can we design or modify a selfish routing network, without 
explicitly imposing an optimal solution, to minimize the inefficiency of its equilibria? 
Can modest intervention significantly reduce the price of anarchy? We briefly discuss 
two techniques for mitigating the inefficiency of selfish routing in nonatomic instances: 
influencing traffic with edge taxes (Subsection 18.5.1) and increasing the capacity of 
the network (Subsection 18.5.2). 


18.5.1 Marginal Cost Pricing 


Our first approach to reducing the price of anarchy in nonatomic selfish routing games 
is to use marginal cost taxes on the edges of the network. The idea of marginal cost 
pricing is to charge each network user on each edge for the additional cost its presence 
causes for the other users of the edge. To discuss this idea formally, we allow each edge 
e of a nonatomic selfish routing network to possess a nonnegative tax $\tau_e$. We denote a
nonatomic instance (G, r, c) with edge taxes $\tau$ by $(G, r, c + \tau)$. An equilibrium flow for
such an instance $(G, r, c + \tau)$ is defined as in Definition 18.1, with all traffic traveling
on routes that minimize the sum of the edge costs and edge taxes. Equivalently, it
is an equilibrium flow for the nonatomic instance $(G, r, c^\tau)$, where the cost function
$c_e^\tau$ is a shifted version of the original cost function $c_e$: $c_e^\tau(x) = c_e(x) + \tau_e$ for all
$x \ge 0$.

The principle of marginal cost pricing asserts that for a flow f feasible for a
nonatomic instance (G, r, c), the tax $\tau_e$ assigned to the edge e should be $\tau_e = f_e \cdot c_e'(f_e)$,
where $c_e'$ denotes the derivative of $c_e$. (Assume for simplicity that the cost functions
are differentiable.) The term $c_e'(f_e)$ corresponds to the marginal increase in cost caused
by one user of the edge, and the term $f_e$ is the amount of traffic that suffers from
this increase. We can also interpret the marginal cost tax $\tau_e$ using Corollary 18.10:
$\tau_e$ is precisely the “extra term” in the marginal cost function that is absent from the
original cost function. These taxes correct for the failure of selfish users to account for
the second, “altruistic” term of the marginal cost function. Formally, Corollary 18.10
easily implies the following guarantee.


Theorem 18.27 Let (G, r, c) be a nonatomic instance such that, for every edge
e, the function $x \cdot c_e(x)$ is convex and continuously differentiable. Let f* be an
optimal flow for (G, r, c) and let $\tau_e = f^*_e \cdot c_e'(f^*_e)$ denote the marginal cost tax
for edge e with respect to f*. Then f* is an equilibrium flow for $(G, r, c + \tau)$.


Marginal cost taxes thus induce an optimal flow as an equilibrium flow; in this 
sense, such taxes reduce the price of anarchy to 1. Theorem 18.27 also holds with 
weaker assumptions on the cost functions; in particular, the convexity hypothesis 
is not needed. For further discussion of pricing problems in routing games, see 
Chapter 22. 
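
As an illustration of marginal cost taxes, the sketch below works through Pigou's example. The instance is assumed (one unit of traffic, constant cost 1 on the upper link, cost c(x) = x on the lower link, with the optimal flow splitting traffic evenly), and the dictionary names are hypothetical.

```python
# Assumed instance: Pigou's example with one unit of traffic.
# Upper link: c(x) = 1 (derivative 0). Lower link: c(x) = x (derivative 1).
cost = {"upper": lambda x: 1.0, "lower": lambda x: x}
dcost = {"upper": lambda x: 0.0, "lower": lambda x: 1.0}

# Optimal flow: x = 1/2 minimizes (1 - x) * 1 + x * x over [0, 1].
f_opt = {"upper": 0.5, "lower": 0.5}

# Marginal cost tax on each edge: tau_e = f*_e * c_e'(f*_e).
tau = {e: f_opt[e] * dcost[e](f_opt[e]) for e in cost}

# Cost plus tax that a user sees on each link when traffic follows f_opt.
taxed = {e: cost[e](f_opt[e]) + tau[e] for e in cost}

print(tau, taxed)
```

With the tax in place both links look equally expensive to a user, so the optimal split is an equilibrium of the taxed instance.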


18.5.2 Capacity Augmentation


Our final result is a novel type of bound on the inefficiency of equilibrium flows in 
nonatomic selfish routing games with arbitrary cost functions. This bound does not 
involve the price of anarchy, which is unbounded in such networks (Example 18.3), 
and instead shows that the cost of an equilibrium flow is at most that of an optimal 
flow that is forced to route twice as much traffic between each source—sink pair. As we 
will see, this result implies that in lieu of centralized control, the inefficiency of selfish 
routing can be offset by a moderate increase in link speed. 


Example 18.28 Consider the nonlinear variant of Pigou’s example (Example 18.3). When there is one unit of traffic, the equilibrium flow routes all of the
flow on the lower edge, while the optimal flow routes $\epsilon$ units of flow on the upper
edge and the rest on the lower edge (where $\epsilon \to 0$ as $p \to \infty$). When the amount
r of traffic to be routed exceeds one, an optimal flow assigns the additional $r - 1$
units of traffic to the upper link, incurring a cost that tends to $r - 1$ as $p \to \infty$.
In particular, for every p an optimal flow feasible for twice the original traffic
amount (r = 2) has cost at least 1, the cost of the equilibrium flow in the original
instance.


We now show that the upper bound stated in Example 18.28 for the nonlinear variant 
of Pigou’s example holds in every nonatomic instance. 


Theorem 18.29 If f is an equilibrium flow for (G, r, c) and f* is feasible for
(G, 2r, c), then $C(f) \le C(f^*)$.


PROOF Let f and f* denote an equilibrium flow for (G, r, c) and a feasible flow
for (G, 2r, c), respectively. For each commodity i, let $d_i$ denote the minimum cost
of an $s_i$-$t_i$ path with respect to the flow f. Definition 18.1 and the definition of
cost (18.1) imply that $C(f) = \sum_i r_i d_i$.

The key idea is to define a set of cost functions $\bar c$ that satisfies two properties:
lower bounding the cost of f* relative to that of f is easy with respect to $\bar c$; and
the new cost functions $\bar c$ approximate the original ones c. Specifically, we set
$\bar c_e(x) = \max\{c_e(f_e), c_e(x)\}$ for each edge e. Let $\bar C(\cdot)$ denote the cost of a flow in
the instance $(G, r, \bar c)$. Note that $\bar C(f^*) \ge C(f^*)$ while $\bar C(f) = C(f)$.

We first upper bound the amount by which the new cost $\bar C(f^*)$ of f* can
exceed its original cost $C(f^*)$. For every edge e, $\bar c_e(x) - c_e(x)$ is zero for $x \ge f_e$
and bounded above by $c_e(f_e)$ for $x < f_e$, so $x(\bar c_e(x) - c_e(x)) \le c_e(f_e) f_e$ for all
$x \ge 0$. Thus

\[
\bar C(f^*) - C(f^*) = \sum_{e \in E} f^*_e \left( \bar c_e(f^*_e) - c_e(f^*_e) \right) \le \sum_{e \in E} c_e(f_e) f_e = C(f). \tag{18.10}
\]

In other words, evaluating f* with cost functions $\bar c$, rather than c, increases its
cost by at most an additive term of C(f).
Now we lower bound $\bar C(f^*)$. By construction, the modified cost $\bar c_e(\cdot)$ of an edge
e is always at least $c_e(f_e)$, so the modified cost $\bar c_P(\cdot)$ of a path $P \in \mathcal{P}_i$ is always at
least $c_P(f)$, which in turn is at least $d_i$. The modified cost $\bar C(f^*)$ therefore equals

\[
\sum_{P \in \mathcal{P}} \bar c_P(f^*) f^*_P \ge \sum_{i=1}^{k} \sum_{P \in \mathcal{P}_i} d_i f^*_P = \sum_{i=1}^{k} 2 r_i d_i = 2C(f). \tag{18.11}
\]

The theorem now follows immediately from inequalities (18.10) and (18.11):
$C(f^*) \ge \bar C(f^*) - C(f) \ge 2C(f) - C(f) = C(f)$.
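
The bound can be sanity-checked numerically on the nonlinear variant of Pigou's example. The sketch below assumes that instance has constant cost 1 on the upper link and cost x^p on the lower link, so the equilibrium at rate 1 has cost 1; the grid search in `opt_cost` is a crude approximation chosen for brevity, and all names are illustrative.

```python
def opt_cost(p: int, r: float, steps: int = 20_000) -> float:
    """Approximate optimal cost at traffic rate r: x units on the lower link
    (cost y**p) and r - x units on the upper link (constant cost 1)."""
    best = float("inf")
    for i in range(steps + 1):
        x = r * i / steps
        best = min(best, x * x**p + (r - x) * 1.0)
    return best

EQ_COST = 1.0   # at rate 1 the equilibrium sends everything on the lower link

# Optimal cost when twice as much traffic (r = 2) must be routed.
results = {p: opt_cost(p, 2.0) for p in (1, 2, 5, 20)}
print(results)
```

For every degree tried, the optimal cost at rate 2 is at least the equilibrium cost at rate 1, as Theorem 18.29 predicts.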


Another interpretation of Theorem 18.29 is that the benefit of centralized control is 
equaled or exceeded by the benefit of a sufficient improvement in link technology. 


Corollary 18.30 Let (G, r, c) be a nonatomic instance and define the modified
cost function $\tilde c$ by $\tilde c_e(x) = c_e(x/2)/2$ for each edge e. Let f be an equilibrium
flow for $(G, r, \tilde c)$ with cost $\tilde C(f)$, and f* a feasible flow for (G, r, c) with cost
$C(f^*)$. Then $\tilde C(f) \le C(f^*)$.


Simple calculations show that Theorem 18.29 and Corollary 18.30 are equivalent; see 
Exercise 18.8(a). 

Corollary 18.30 takes on a particularly nice form in instances in which all cost
functions are M/M/1 delay functions. Such a cost function has the form $c_e(x) = (u_e - x)^{-1}$, where $u_e$ can be interpreted as an edge capacity or a queue service rate; the
function is defined to be $+\infty$ when $x \ge u_e$. (Rigorously allowing infinite costs in
this selfish routing model requires some care; we ignore these issues in this chapter.)
In this case, the modified function $\tilde c_e$ of Corollary 18.30 is $\tilde c_e(x) = 1/(2(u_e - x/2)) =
1/(2u_e - x)$. Corollary 18.30 thus suggests the following design principle for selfish
routing networks with M/M/1 delay functions: to outperform optimal routing, just
double the capacity of every edge.
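
The design principle can be tried out on a toy network. The sketch below assumes a hypothetical two-link instance (capacities 2 and 1, traffic rate 1.5) and compares the equilibrium cost with doubled capacities against a grid-search approximation of the optimal cost with the original capacities. The solver handles only two parallel links with increasing costs, and every name in it is illustrative.

```python
def mm1(u):
    """M/M/1 delay function with capacity u (only evaluated for x < u here)."""
    return lambda x: 1.0 / (u - x)

def equilibrium_cost(c1, c2, r, iters=100):
    """Cost of the equilibrium flow on two parallel links with increasing costs."""
    gap = lambda x: c1(x) - c2(r - x)   # increasing in x = flow on link 1
    if gap(0.0) >= 0:
        x = 0.0                         # link 2 is never worse: use it alone
    elif gap(r) <= 0:
        x = r                           # link 1 is never worse: use it alone
    else:
        lo, hi = 0.0, r                 # bisection for equal link costs
        for _ in range(iters):
            mid = (lo + hi) / 2
            lo, hi = (mid, hi) if gap(mid) < 0 else (lo, mid)
        x = (lo + hi) / 2
    return x * c1(x) + (r - x) * c2(r - x)

def optimal_cost(c1, c2, r, u1, u2, steps=100_000):
    best = float("inf")
    for i in range(1, steps):
        x = r * i / steps
        if x < u1 and r - x < u2:       # skip splits that overload a link
            best = min(best, x * c1(x) + (r - x) * c2(r - x))
    return best

u1, u2, r = 2.0, 1.0, 1.5
eq_doubled = equilibrium_cost(mm1(2 * u1), mm1(2 * u2), r)
opt_original = optimal_cost(mm1(u1), mm1(u2), r, u1, u2)
print(eq_doubled, opt_original)
```

In this instance selfish routing with doubled capacities is considerably cheaper than optimal routing with the original capacities, in line with Corollary 18.30.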


18.6 Notes 


18.6.1 Nonatomic Selfish Routing 


Nonatomic selfish routing was first studied in the context of transportation networks. 
Pigou (1920) informally discussed Pigou’s example in his 1920 book, The Economics 
of Welfare, in order to illustrate the inefficiency of equilibria. He also anticipated 
the principle of marginal cost pricing discussed in Theorem 18.27; indeed, marginal 
cost taxes are sometimes called Pigouvian taxes. The model was first formally de- 
fined by Wardrop (1952). For this reason, equilibrium flows in nonatomic selfish 
routing games are often called Wardrop equilibria. We use the term “equilibrium 
flow” so that the terminology for nonatomic and atomic selfish routing games is the 
same. 

Beckmann et al. (1956) proved a number of fundamental results for the nonatomic 
model. Theorem 18.8, Proposition 18.9, Corollary 18.10, Proposition 18.11, and 
Theorem 18.27 were first proved in Beckmann et al. (1956), via proofs essentially 
identical to the ones given here. Details on first-order conditions for convex pro- 
gramming problems can be found in Bertsekas (1999, Chapter 2). Schmeidler (1973) 
founded the theory of general noncooperative nonatomic games. 

Two decades after nonatomic selfish routing games were first defined, researchers
began to use them to model the routing of data through communication networks. 
Nonatomic selfish routing is immediately relevant for networks that employ so-called 
source routing, meaning that each sender is responsible for selecting a full path of links 
to the receiver. Assuming that senders seek paths of minimum cost, senders of data in 
such networks correspond to the users of a selfish routing network. 

In large networks such as the Internet, distributed shortest-path routing is typically 
used instead of source routing. In distributed shortest-path routing, each link is given 
a positive length, and data are forwarded along a path of minimum total length to its
destination. Shortest-path routing leaves a key parameter unspecified: the length of 
each edge. A direct correspondence between selfish routing and shortest-path routing 
exists if and only if the edge cost functions coincide with the lengths used to define 
shortest paths. In other words, when an x fraction of the overall network traffic is 
using an edge with cost function c(-), then the corresponding shortest-path routing 
algorithm should define the length of the edge as the number c(x). If the cost function 
c is nonconstant, then this is a congestion-dependent definition of the edge length. 
In this case, shortest-path routing will route traffic exactly as if it is a network with 
selfish routing (or source routing). For details on this equivalence, see the textbook by 
Bertsekas and Tsitsiklis (1989). See Qiu et al. (2003), for example, for a more recent 
paper that studies selfish routing from a computer networking perspective. 

Braess’s Paradox was discovered by Braess (1968). The variant in Example 18.4 was 
noted by L. Schulman (personal communication, October 1999). For surveys on the 
large literature inspired by Braess’s Paradox, see Roughgarden (2006) and D. Braess’s 
home page (Braess, 2007). 

Cohen and Horowitz (1991) noted that Braess’s Paradox has startling analogues in 
physical systems. For instance, Example 18.4 can be simulated in the following system 
of strings and springs. One end of a spring is attached to a fixed support, and the other 
end to a very short string. A second identical spring is hung from the free end of the 
string and carries a heavy weight. Finally, strings are connected, with very little slack, 
from the support to the upper end of the second spring and from the lower end of the 
first spring to the weight. Assuming that the springs are ideally elastic, the stretched 
length of a spring is a linear function of the force applied to it. We can therefore view 
the network of strings and springs as a selfish routing game, where force corresponds to 
traffic and physical distance corresponds to cost. Remarkably, severing the very short 
taut string causes the weight to levitate away from the ground! The rise in the weight is 
the same as the improvement in the equilibrium flow obtained by deleting the zero-cost 
edge of Figure 18.2(b) to recover the network of Figure 18.2(a). 

The price of anarchy in nonatomic selfish routing games was first studied by 
Roughgarden and Tardos (2002). The nonlinear variant of Pigou’s example (Exam- 
ple 18.3) is from Roughgarden and Tardos (2002), as is Theorem 18.16. Roughgarden 
and Tardos (2002) also proved the special case of Theorem 18.21 for networks with 
affine cost functions (where the price of anarchy is at most 4/3). Roughgarden (2003) 
introduced the Pigou bound and proved Theorem 18.21 under the same convexity 
hypothesis used in Theorem 18.9. The solution to Exercise 18.5 can also be found 
in Roughgarden (2003). A. Ronen (personal communication, March 2002) suggested 
using the variational inequality in Proposition 18.20, which was first proved by Smith 
(1979). Correa et al. (2004) proved Theorem 18.21 without any convexity assumptions. 
This theorem has been generalized to wider classes of nonatomic games; see 
Roughgarden (2005a) for a survey, as well as a discussion of the price of anarchy of 
nonatomic selfish routing games with nonutilitarian objectives. 

Finally, Theorem 18.29 is due to Roughgarden and Tardos (2002). A proof of 
Corollary 18.30 and a counterexample to Theorem 18.29 in atomic instances can be 
found in Roughgarden (2005a). For extensions of Theorem 18.29 to networks with 
restricted cost functions, including a solution to Exercise 18.8(e), see Chakrabarty 
(2004) and Correa et al. (2005). 


18.6.2 Atomic Selfish Routing 


Atomic selfish routing games were first considered by Rosenthal (1973), who proved 
Theorem 18.12 with the potential function method. Rosenthal also introduced the con- 
cept of “congestion games” (Remark 18.2). Monderer and Shapley (1996) undertook a 
more general study of “potential games” — games that admit a potential function, which 
in turn can be used to prove that best-response dynamics converge to an equilibrium 
(Remark 18.13). Potential games are now studied in their own right; see Voorneveld 
et al. (1999) and Roughgarden (2005a, Section 4.8) for surveys of this literature. 

Rosenthal (1973) showed that equilibrium flows need not exist in weighted multi- 
commodity atomic instances. Example 18.7 is due to Goemans et al. (2005). Fotakis 
et al. (2005) proved Theorem 18.15 for weighted instances with affine cost functions. 

The price of anarchy of atomic instances was first studied by Suri et al. (2007) in the 
context of the asymmetric scheduling games described in Exercise 18.3 below. Among 
other results, they proved an upper bound of 5/2 on the price of anarchy in such games 
when each player controls one unit of traffic and when all cost functions are affine. This 
paper also introduced the proof structure used to prove Theorem 18.23 in this chapter. 

Awerbuch et al. (2005) significantly generalized the results in Suri et al. (2007). 
They proved Theorem 18.23, as well as the refinement discussed in Exercise 18.7. The 
AAE example and the variant in Exercise 18.2(a) are from Awerbuch et al. (2005), 
as are the exponential (in the degree bound p) upper and lower bounds on the price 
of anarchy for polynomial cost functions with nonnegative coefficients. For refined 
versions of these upper and lower bounds, see Olver (2006). Awerbuch et al. (2005) 
extended all of their upper bounds to mixed-strategy Nash equilibria. Goemans et al. 
(2005) extended the upper bounds to “sink equilibria,” a notion of equilibrium that is 
motivated by best-response dynamics and that always exists in finite noncooperative 
games. 

For unweighted instances and pure-strategy equilibrium flows, the results in 
Awerbuch et al. (2005) were obtained independently by Christodoulou and Koutsoupias 
(2005b). The proofs in Christodoulou and Koutsoupias (2005b) extend without much 
difficulty to weighted instances and mixed-strategy Nash equilibria. Christodoulou and 
Koutsoupias (2005b) also studied the price of anarchy with respect to the egalitarian 
objective (see Section 17.1) and provide solutions to parts (b) and (c) of Exercise 18.2. 

Caragiannis et al. (2006) provide a solution to Exercise 18.3(b), as well as numerous 
other results about the price of anarchy and stability in different classes of asymmetric 
scheduling instances. For results on the price of stability in atomic selfish routing 
games, see Anshelevich et al. (2004), Christodoulou and Koutsoupias (2005a), and 
Caragiannis et al. (2006). 

Finally, several researchers have studied selfish routing in the atomic splittable 
model. This model is similar to the atomic selfish routing games studied in this chapter; 
the key difference is that a player i is permitted to route its r_i units of traffic fractionally 
over the s_i–t_i paths of the network. This model is also different from nonatomic selfish 
routing games; for example, if there is only one player controlling all of the traffic in 
the network, then the player will minimize its cost by routing this traffic optimally. 
More generally, a player takes into account the congestion it causes for its own traffic, 
while ignoring the congestion it creates for other players. 

Equilibrium flows in the atomic splittable model can behave in counterintuitive ways 
(see Exercise 18.9, taken from Catoni and Pallottino, 1991), and the price of anarchy 
in this model is not well understood. It was initially claimed that the upper bounds 
on the price of anarchy for nonatomic instances carry over to atomic splittable ones 
(Roughgarden, 2005b; Correa et al., 2005), but Cominetti et al. (2006) recently gave 
counterexamples to these claims in multicommodity networks. Obtaining tight bounds 
on the price of anarchy in this model remains an important open question. 


Exercises 


18.1 Recall the nonlinear variant of Pigou’s example (Example 18.3). Prove that as the 
degree p of the cost function of the second link tends to infinity, the price of 
anarchy tends to infinity as p/ln p. 


18.2 This exercise explores lower bounds on the price of anarchy in atomic selfish 
routing games with affine cost functions. 


(a) Modify the players’ weights in the AAE example (Example 18.6) so that the price 
of anarchy in the resulting weighted atomic instance is precisely (3 + √5)/2 ≈ 
2.618. 

(b) Can you devise an unweighted atomic instance with 3 players, affine cost 
functions, and price of anarchy equal to 5/2? Can you achieve a price of 
anarchy of (3 + √5)/2 using 3 players and variable weights? 

(c) What is the largest price of anarchy in an atomic instance with affine cost 
functions and only 2 players? 


18.3 An asymmetric scheduling instance differs from an atomic selfish routing instance in 
the following two respects. First, the underlying network is restricted to a common 
source vertex s, a common sink vertex t, and a set of parallel links that connect s 
to t. On the other hand, we allow different players to possess different strategy sets: 
each player i has a prescribed subset S_i of the links that it is permitted to use. 


(a) Show that every asymmetric scheduling instance is equivalent to an atomic 
selfish routing game. Your reduction should make use only of the cost functions 
of the original scheduling instance, plus possibly the all-zero cost function. 

(b) [Difficult] Part (a) shows that the worst-case price of anarchy in asymmetric 
scheduling instances with affine cost functions is at most that in atomic selfish 
routing games with affine cost functions. Prove that the worst-case price of 
anarchy is the same in the two models, equal to 5/2 in unweighted instances 
and (3 + √5)/2 in weighted instances. 


18.4 Prove Theorem 18.15. Make use of the following potential function: 

\[ \Phi(f) = \sum_{e \in E} \Bigl( c_e(f_e)\, f_e + \sum_{i \in S_e} c_e(r_i)\, r_i \Bigr), \]

where S_e denotes the set of players that choose a path in f that includes the edge e. 


18.5 A set C of cost functions is inhomogeneous if it contains at least one function 
c satisfying c(0) > 0. Extend Proposition 18.19 to inhomogeneous sets of cost 
functions. 

[Hint: Simulate a Pigou-like example using a more complex network and cost 
functions drawn only from the given set C.] 


18.6 Prove that if C is the set of nonnegative, nondecreasing, concave cost functions, 
then the Pigou bound α(C) equals 4/3. 


18.7 Improve the upper bound of Theorem 18.23 for unweighted atomic instances 
with affine cost functions. Can you match the lower bound provided by the AAE 
example? 


18.8 This exercise studies refinements and extensions of Theorem 18.29. 


(a) Deduce Corollary 18.30 from Theorem 18.29. 

(b) Show that Theorem 18.29 does not always hold in atomic selfish routing games. 

(c) Suppose we define f* to be a flow feasible for the instance (G, (1 + δ)r, c), 
where δ > 0 is a parameter. (In Theorem 18.29, δ = 1.) How does the guarantee 
of Theorem 18.29 change? 

(d) Use Example 18.3 to prove that your bound in part (c) is the best possible. 

(e) Determine the smallest value of δ such that the following statement is true: for 
every nonatomic instance (G, r, c) with affine cost functions, for every equilibrium 
flow f for (G, r, c) and optimal flow f* for (G, (1 + δ)r, c), C(f) ≤ C(f*). 
(Theorem 18.29 implies that the statement holds with δ = 1; the question is 
whether or not our restriction on the cost functions permits smaller values of δ.) 


18.9 Recall the atomic splittable selfish routing model discussed at the end of 
Section 18.6. Given such a game, we can obtain a new game by replacing a 
player that routes r_i units of traffic from s_i to t_i by two players that each route r_i/2 
units of traffic from s_i to t_i. This operation does not change the cost of an optimal 
flow. Intuitively, since it decreases the amount of cooperation in the network, it 
should only increase the cost of an equilibrium flow. Prove that this intuition is 
incorrect: in multicommodity atomic splittable selfish routing networks, splitting a 
player in two can decrease the price of anarchy. 


CHAPTER 19 


Network Formation Games and 
the Potential Function Method 


Éva Tardos and Tom Wexler 


Abstract 


Large computer networks such as the Internet are built, operated, and used by a large number of 
diverse and competitive entities. In light of these competing forces, it is surprising how efficient 
these networks are. An exciting challenge in the area of algorithmic game theory is to understand 
the success of these networks in game theoretic terms: what principles of interaction lead selfish 
participants to form such efficient networks? 

In this chapter we present a number of network formation games. We focus on simple games that 
have been analyzed in terms of the efficiency loss that results from selfishness. We also highlight a 
fundamental technique used in analyzing inefficiency in many games: the potential function method. 


19.1 Introduction 


The design and operation of many large computer networks, such as the Internet, are 
carried out by a large number of independent service providers (Autonomous Systems), 
all of whom seek to selfishly optimize the quality and cost of their own operation. 
Game theory provides a natural framework for modeling such selfish interests and 
the networks they generate. These models in turn facilitate a quantitative study of the 
trade-off between efficiency and stability in network formation. In this chapter, we 
consider a range of simple network formation games that model distinct ways in which 
selfish agents might create and evaluate networks. All of the models we present aim 
to capture two competing issues: players want to minimize the expenses they incur in 
building a network, but at the same time seek to ensure that this network provides them 
with a high quality of service. 

There are many measures by which players might evaluate the quality of a network. 
In this chapter, we focus primarily on measures of distance (Section 19.2) and con- 
nectivity (Section 19.3), rather than measures based on congestion effects (as is done 
in Chapter 18). We also assume that players have financial considerations. In Sections 
19.2 and 19.3, players seek to minimize the construction costs of the networks they 
create. In Section 19.4, we look at a game with a more sophisticated financial aspect: 
players represent service providers who set prices for users and seek to maximize their 
profit, namely their income from users minus the cost of providing the service. 

For all of the games we consider, we use Nash equilibrium as the solution concept, 
and refer to networks corresponding to these equilibria as being stable. The models 
we focus on involve players who can unilaterally build edges, and thus the Nash 
equilibrium solution concept is appropriate. 

To evaluate the overall quality of a network, we consider the social cost, or the sum 
of all players’ costs. We refer to the networks that optimize social cost as optimal or 
socially efficient. The main goal of this chapter is to better understand the quantitative 
trade-off between networks that are stable and those that are socially efficient. More 
precisely, we are interested in bounding the price of anarchy and the price of stability (as 
defined in Chapter 17). The models we consider in this chapter are network formation 
games in which these measures are provably small. 

In Section 19.2 we consider a local connection game where the nodes of the graph 
are players who pay for the edges that connect them directly to other nodes (incident 
edges). In selecting a strategy, players face two conflicting desires: to pay as little as 
possible, and to have short paths to all other nodes. Our goal here is to bound the 
efficiency loss resulting from stability. Such connection games have been extensively 
studied in the economics literature (see Jackson (2006) for a survey) to model social 
network formation, using edges to represent social relations. The local connection 
game can also be thought of as a simple model for the way subnetworks connect in 
computer networks (by establishing peering points), or as modeling the formation of 
subnetworks in overlay systems such as P2P (peer-to-peer) networks connecting users 
to each other for downloading files. 

We will use a model in which players can form edges to a neighbor unilaterally, 
and will use Nash equilibrium as our solution concept. This differs from much of the 
literature in economics, where it is typically assumed that an edge between two players 
needs the consent or contribution from both players, and where the notion of pairwise 
stability is used instead of Nash equilibria. We will discuss how the results in Section 
19.2 extend to models using pairwise stable equilibria in the notes in Section 19.5.1. 

The model we examine was introduced by Fabrikant et al. (2003) and represents 
the first quantitative effort to understand the efficiency loss of stable networks. In this 
game, a single parameter α represents the cost of building any one edge. Each player 
(represented by a node) perceives the quality of a network as the sum of distances to 
all other nodes. Players aim to minimize a cost function that combines both network 
quality and building costs: they attempt to minimize the sum of the building costs they 
incur and the distances to all other players. Thus, players use α as a trade-off parameter 
between their two objectives. This is perhaps the simplest way to model this type of 
trade-off. While the simplicity of this game makes it easy to evaluate, such a stylized 
model ignores a number of issues, such as varying costs and possible congestion effects. 
In Section 19.5.1, we discuss related models that address some of these issues. 

In Section 19.3 we study a very different (and also quite simple) model of network 
design, introduced by Anshelevich et al. (2004), called the global connection game. 
Whereas players in the game of Section 19.2 only make local choices (which other nodes 
to link to), players in this game make global decisions, in that they may build edges 
throughout the network. Unlike the local connection game, this global game attempts 
to model players who actually build and maintain large-scale shared networks. This 
model also allows for greater heterogeneity in the underlying graph. 

In the global connection game, a player is not associated with an individual node of 
the network, but instead has certain global connectivity goals. To achieve these goals, 
a player may contribute money to any set of edges in the network. As before, we view 
connectivity as the primary measure of quality. However, players do not desire uniform 
connectivity; instead, each player has a subset of nodes that it needs to connect, and 
aims to do so as cheaply as possible. Furthermore, unlike in the local game, players are 
not concerned with distance, and simply want to connect their terminals. 

As in the previous model, players are sensitive to costs. Edge e has a cost c_e ≥ 0, 
and players who use e share this cost. In particular, we focus on a fair sharing rule; 
all players using an edge must share its cost evenly. This natural cost-sharing scheme 
can be derived from the Shapley value, and has many nice properties. We also examine 
other cost-sharing games, and discuss the role of fair sharing in the price of stability 
results. 

A key technique used in this section is the potential function method. This method 
has emerged as a general technique in understanding the quality of equilibria. We 
review this technique in detail in Section 19.3.2. While this technique provides results 
only regarding the price of stability, it is interesting to note that many of the currently 
known price of anarchy results (e.g., most of the results in Part III of this book) are for 
potential games. 

In Section 19.4, we consider another potential game; a facility location game with a 
more sophisticated cost model. In the previous two sections, players simply minimized 
their costs. Here, edges still have costs, but players also select prices for users so as 
to maximize net income: price charged minus the cost paid. We again consider a very 
simplified model in which players place facilities to serve clients, thereby forming 
a network between the providers and the clients. We show that a socially efficient 
network is stable (i.e., the price of stability is 1), and bound the price of anarchy. 

In the context of facility location games, we also bound the quality of solutions 
obtained after sufficiently long selfish play, without assuming that players have yet 
reached an equilibrium. As we have seen in Part I of this book, equilibrium solutions 
may be hard to find (Chapter 2), and natural game play may not converge to an 
equilibrium (Chapter 4). Thus it is often useful to evaluate the quality of the transient 
solutions that arise during competitive play. The facility location game considered in 
this section is one of the few classes of games for which this strong type of bound is 
known. 


19.2 The Local Connection Game 


In this section we consider the simple network formation game of Fabrikant et al. 
(2003), where players can form links to other players. We consider a game with n 
players, where each player is identified with a node. Node u may choose to build 
edges from u to any subset of nodes, thereby creating a network. Players have two 
competing goals: they want to build (and thus pay for) as few edges as possible, yet 
they also want to form a network that minimizes the distance from their own node to 
all others. Our main focus in this section is to quantitatively understand the inefficiency 
that results from the selfish behavior of these network builders. 


19.2.1 Model 


Players in the local connection game are identified with nodes in a graph G on which 
the network is to be built. A strategy for player u is a set of undirected edges that u 
will build, all of which have u as one endpoint. Given a strategy vector S, the set of 
edges in the union of all players’ strategies forms a network G(S) on the player nodes. 
Let dist_S(u, v) be the length of the shortest path (in terms of number of edges) between 
u and v in G(S). We use dist(u, v) when S is clear from context. The cost of building an 
edge is specified by a single parameter, α. Each player seeks to make the distances to all 
other nodes small, and to pay as little as possible. More precisely, player u’s objective is 
to minimize the sum of costs and distances α n_u + Σ_v dist(u, v), where n_u is the number 
of edges bought by player u. 

Observe that since edges are undirected, when a node u buys an edge (u, v), that 
edge is also available for use from v to u, and in particular, is available for node v. 
Thus, at Nash equilibrium at most one of the nodes u and v pays for the connecting 
edge (u, v). Also, since the distance dist(u, v) is infinite whenever u and v are not 
connected, at equilibrium we must have a connected graph. We say that a network 
G = (V, E) is stable for a value α if there is a stable strategy vector S that forms G. 

The social cost of a network G is SC(G) = Σ_{u≠v} dist(u, v) + α|E|, the sum of 
players’ costs. Note that the distance dist(u, v) contributes to the overall quality twice 
(once for u and once for v). We will be comparing solutions that are stable to those 
that are optimal under this measure. 
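
The definitions above can be made concrete with a short computation. The following is a minimal sketch rather than anything taken from the original model description: the dictionary representation of strategies and all function names are illustrative assumptions. It builds G(S) from the edges each player buys, computes hop distances by breadth-first search, and evaluates the player cost α n_u + Σ_v dist(u, v) and the social cost.

```python
from collections import deque


def build_network(strategies):
    """strategies[u] is the set of nodes v for which player u buys the edge (u, v)."""
    adjacency = {u: set() for u in strategies}
    for u, bought in strategies.items():
        for v in bought:
            adjacency[u].add(v)
            adjacency[v].add(u)  # edges are undirected, so v can use them too
    return adjacency


def distances_from(source, adjacency):
    """Hop distance from source to every node (infinite if unreachable)."""
    dist = {node: float("inf") for node in adjacency}
    dist[source] = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if dist[neighbor] == float("inf"):
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def player_cost(u, strategies, alpha):
    """alpha * n_u plus the sum of distances from u to all other nodes."""
    dist = distances_from(u, build_network(strategies))
    return alpha * len(strategies[u]) + sum(d for v, d in dist.items() if v != u)


def social_cost(strategies, alpha):
    """Sum of all players' costs; equals SC(G) when no edge is bought twice."""
    return sum(player_cost(u, strategies, alpha) for u in strategies)


# Hypothetical example: a star on 4 nodes whose center (node 0) buys every edge.
star = {0: {1, 2, 3}, 1: set(), 2: set(), 3: set()}
print(player_cost(0, star, 3), player_cost(1, star, 3), social_cost(star, 3))
```

In this example the center pays for three edges and is at distance 1 from everyone, while each leaf pays nothing but is at distance 2 from the other leaves.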


19.2.2 Characterization of Solutions and the Price of Stability 


We now characterize the structure of an optimal solution as a function of α. A network 
is optimal or efficient if it minimizes the social cost SC(G). 


Lemma 19.1 If α ≥ 2 then any star is an optimal solution, and if α ≤ 2 then 
the complete graph is an optimal solution. 


PROOF Consider an optimal solution G with m edges. We know m ≥ n − 1; 
otherwise, the graph would be disconnected, and thus have an infinite cost. All 
ordered pairs of nodes not directly connected by an edge must have a distance 
of at least 2 from each other, and there are n(n − 1) − 2m such pairs. Adding 
the remaining 2m pairs with distance 1 yields αm + 2n(n − 1) − 4m + 2m = 
(α − 2)m + 2n(n − 1) as a lower bound on the social cost of G. Both a star and 
the complete graph match this bound. Social cost is minimized by making m as 
small as possible when α > 2 (a star) and as large as possible when α < 2 (a 
complete graph). 
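
For small n the lemma can be checked exhaustively. The sketch below is an illustration under stated assumptions, not part of the original proof: the function names are invented here, and the search simply enumerates every graph on n labeled nodes, which is only feasible for very small n. It compares the brute-force optimum with the bound (α − 2)m + 2n(n − 1) evaluated at m = n − 1 (a star) and m = n(n − 1)/2 (the complete graph).

```python
from itertools import combinations


def social_cost_of_graph(n, edges, alpha):
    """Sum of hop distances over ordered pairs plus alpha per edge (inf if disconnected)."""
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1
    for k in range(n):  # Floyd-Warshall all-pairs shortest paths
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return sum(map(sum, dist)) + alpha * len(edges)


def lower_bound(n, m, alpha):
    """The bound (alpha - 2) m + 2 n (n - 1) from the proof of Lemma 19.1."""
    return (alpha - 2) * m + 2 * n * (n - 1)


def brute_force_optimum(n, alpha):
    """Minimum social cost over all graphs on n labeled nodes (exponential time)."""
    all_pairs = list(combinations(range(n), 2))
    best = float("inf")
    for mask in range(1 << len(all_pairs)):
        edges = [pair for i, pair in enumerate(all_pairs) if (mask >> i) & 1]
        best = min(best, social_cost_of_graph(n, edges, alpha))
    return best


n = 5
star_edges = [(0, v) for v in range(1, n)]
complete_edges = list(combinations(range(n), 2))
print(brute_force_optimum(n, 3), social_cost_of_graph(n, star_edges, 3))
print(brute_force_optimum(n, 1), social_cost_of_graph(n, complete_edges, 1))
```

For α = 3 the optimum coincides with the cost of the star, and for α = 1 with the cost of the complete graph, as the lemma predicts.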


Both the star and the complete graph can also be obtained as a Nash equilibrium for 
certain values of α, as shown in the following lemma. 


Lemma 19.2 If α ≥ 1 then any star is a Nash equilibrium, and if α ≤ 1 then 
the complete graph is a Nash equilibrium. 


PROOF First suppose α ≥ 1, and consider a star. It turns out that any assignment 
of edges to incident players corresponds to a Nash equilibrium, but for this result, 
we need only demonstrate a single solution. In particular, consider the strategy in 
which player 1 (the center of the star) buys all edges to the other players, while the 
remaining n − 1 leaf players buy nothing. Player 1 has no incentive to deviate, as 
doing so disconnects the graph and thus incurs an infinite penalty. Any leaf player 
can deviate only by adding edges. For any leaf player, adding k edges saves k 
in distance but costs αk, and thus is not a profitable deviation. Thus the star is a 
Nash equilibrium. 

Now suppose α ≤ 1. Consider a complete graph, with each edge assigned to 
an incident player. A player who stops paying for a set of k edges saves αk in 
cost, but increases total distances by k, so this outcome is stable. 
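
The deviation argument can be checked mechanically on small instances. The following sketch is an illustration only: the representation of strategies and the helper names are assumptions made here, and the check enumerates every alternative edge set for every player, so it is exponential in n.

```python
from collections import deque
from itertools import combinations


def cost(u, strategies, alpha):
    """Player u's cost: alpha per edge bought plus hop distances (inf if disconnected)."""
    adjacency = {x: set() for x in strategies}
    for x, bought in strategies.items():
        for y in bought:
            adjacency[x].add(y)
            adjacency[y].add(x)
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adjacency[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    if len(dist) < len(strategies):
        return float("inf")
    return alpha * len(strategies[u]) + sum(dist.values())


def is_nash(strategies, alpha):
    """True if no player can strictly lower its cost by changing its own edge set."""
    nodes = list(strategies)
    for u in nodes:
        current = cost(u, strategies, alpha)
        others = [v for v in nodes if v != u]
        for size in range(len(others) + 1):
            for choice in combinations(others, size):
                deviation = dict(strategies)
                deviation[u] = set(choice)
                if cost(u, deviation, alpha) < current:
                    return False
    return True


def star(n):
    """Node 0 is the center and buys every edge; leaves buy nothing."""
    return {0: set(range(1, n)), **{v: set() for v in range(1, n)}}


def complete(n):
    """Each edge (u, v) with u < v is bought by its lower-numbered endpoint."""
    return {u: set(range(u + 1, n)) for u in range(n)}


print(is_nash(star(5), 1.5), is_nash(complete(5), 0.5))
print(is_nash(star(5), 0.5), is_nash(complete(5), 1.5))
```

The first line of output reports the two cases covered by Lemma 19.2; the second shows that each structure stops being stable once α moves to the other side of 1.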


There are other equilibria as well, some of which are less efficient (see Exercise 
19.6). However, these particular Nash equilibria, in conjunction with the above optimal 
solutions, suffice to upper bound the price of stability. 


Theorem 19.3 If α ≥ 2 or α ≤ 1, the price of stability is 1. For 1 < α < 2, the 
price of stability is at most 4/3. 


PROOF The statements about α ≤ 1 and α ≥ 2 are immediate from Lemmas 
19.1 and 19.2. When 1 < α < 2, the star is a Nash equilibrium, while the optimum 
structure is a complete graph. To establish the price of stability, we need to compute 
the ratio of costs of these two solutions. The worst case for this ratio occurs when 
α approaches 1, where it attains a value of 

\[ \frac{2n(n-1) - (n-1)}{2n(n-1) - n(n-1)/2} = \frac{4n^2 - 6n + 2}{3n^2 - 3n} < \frac{4}{3}. \]


Exercise 19.3 shows that the complete graph is the unique equilibrium for α < 1, 
so we also have that the price of anarchy is 1 in this range. We now address the price 
of anarchy for larger values of α. 
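
Before turning to the price of anarchy, the ratio used in the proof of Theorem 19.3 can be evaluated exactly. This is a small sketch under the assumption that the social cost of a graph with m edges and all non-adjacent pairs at distance 2 is (α − 2)m + 2n(n − 1), as in the proof of Lemma 19.1; the function names are illustrative.

```python
from fractions import Fraction


def star_cost(n, alpha):
    """Social cost of a star: m = n - 1 edges."""
    return (alpha - 2) * (n - 1) + 2 * n * (n - 1)


def complete_cost(n, alpha):
    """Social cost of the complete graph: m = n(n - 1)/2 edges."""
    m = Fraction(n * (n - 1), 2)
    return (alpha - 2) * m + 2 * n * (n - 1)


def stability_ratio(n, alpha):
    """Cost of the star divided by cost of the complete graph, as an exact fraction."""
    return Fraction(star_cost(n, alpha)) / complete_cost(n, alpha)


# At alpha = 1 the ratio equals (4n^2 - 6n + 2) / (3n^2 - 3n), which stays below 4/3.
for n in (3, 10, 100, 1000):
    print(n, stability_ratio(n, 1), float(stability_ratio(n, 1)))
```

The ratio approaches 4/3 from below as n grows, and for a fixed n it shrinks as α increases from 1 toward 2.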


19.2.3 The Price of Anarchy 


The first bound on the price of anarchy for this game was given by Fabrikant et al. 
(2003), and involves two steps: bounding the diameter of the resulting graph, and using 
the diameter to bound the cost. We begin with the second step. 


Lemma 19.4 If a graph G at Nash equilibrium has diameter d, then its social 
cost is at most O(d) times the minimum possible cost. 


PROOF The cost of the optimal solution is at least Ω(αn + n²), as we need to buy 
a connected graph, which costs at least (n − 1)α, and there are Ω(n²) distances, 
each of which is at least 1. To bound the quality of the solution, consider the 
distance costs and edge costs separately. The distance cost is at most n²d, and 
thus is at most d times the minimum possible. 

We now examine edge costs. First we consider cut edges, those edges whose 
removal disconnects G. There are at most n − 1 cut edges, so the total cost of 
all cut edges is at most α(n − 1), which in turn is at most the optimal solution 
cost. Now consider the set of all noncut edges paid for by a vertex v. We will 
argue that there are O(nd/α) such edges, with cost O(dn) for node v, and thus 
the total cost of all noncut edges is O(dn²). This will establish that the cost of G 
is O(αn + dn²), completing the proof. 

Pick a node u, and for each edge e = (u, v) paid for by node u, let V_e be the 
set of nodes w, where the shortest path from u to w goes through edge e. We 
will argue that the distance between nodes u and v with edge e deleted is at most 
2d. Thus deleting e increases the total distance from u to all other nodes by at 
most 2d|V_e|. Since deleting the edge would save α in edge costs and G is stable, 
we must have that α ≤ 2d|V_e|, and hence |V_e| ≥ α/(2d). If there are at least α/(2d) 
nodes in each V_e, then the number of such edges adjacent to a node v must be at 
most 2dn/α, as claimed. 

We now bound the distance between nodes u and v with edge e deleted. 
Consider Figure 19.1, depicting a shortest path avoiding edge e. Let e′ = (u′, v′) 
be the edge on this path entering the set V_e. The segment P_u of this path from u 
to node u′ is the shortest path from u to u′ as u′ ∉ V_e, and hence deleting e does 
not affect the shortest path. So P_u is at most d long. The segment P_v from v′ to v 
is at most d − 1 long, as P_v ∪ e forms the shortest path between u and v′. Thus 
the total length is at most 2d. □ 


Using this lemma, we can bound the price of anarchy by O(√α). 


Theorem 19.5 The diameter of a Nash equilibrium is at most 2√α, and hence 
the price of anarchy is at most O(√α). 


Figure 19.1. Path P_u, (u′, v′), P_v is the u–v shortest path after edge e = (u, v) is deleted. 

Figure 19.2. Nodes u and v that are at maximum distance d apart. B is the set of nodes at 
most d′ = (d − 1)/4 away from node u, and A_w is the set of nodes whose shortest path leaves 
B at w. 


PROOF From Lemma 19.4, we need only prove that for any nodes u and v, 
dist(u, v) ≤ 2√α. Suppose for nodes u and v, dist(u, v) ≥ 2k, for some k. The 
main observation is that by adding the edge (u, v), the node u would pay α and 
improve her distance to the nodes on the second half of the u–v shortest path 
by (2k − 1) + (2k − 3) + ··· + 1 = k². So if dist(u, v) > 2√α, node u would 
benefit from adding the edge (u, v), a contradiction. 
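
The counting step in the proof of Theorem 19.5 can be reproduced directly. The sketch below is illustrative (the function names are assumptions): it considers only the nodes x_0 = u, x_1, ..., x_{2k} = v on a shortest path with exactly 2k edges, and sums how much closer each becomes to u once the edge (u, v) is available. Nodes off the path may also benefit, so this is a lower bound on u's savings.

```python
def savings_from_shortcut(k):
    """Distance u saves to the nodes of a u-v shortest path with 2k edges after adding (u, v)."""
    length = 2 * k
    total = 0
    for j in range(length + 1):
        before = j                        # x_j is at distance j from u on a shortest path
        after = min(j, 1 + (length - j))  # old route, or new edge to v and then back to x_j
        total += before - after
    return total


def shortcut_is_profitable(k, alpha):
    """u gains from buying (u, v) when the saved distance exceeds the edge price alpha."""
    return savings_from_shortcut(k) > alpha


for k in (1, 2, 3, 10):
    print(k, savings_from_shortcut(k))
```

Only the nodes in the second half of the path get closer, by 1, 3, ..., 2k − 1 respectively, which is the sum evaluated in the proof.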


We now show an O(1) bound on the price of anarchy that was given by Lin (2003) 
(and independently also by Albers et al., 2006) for α = O(√n). 


Theorem 19.6 The price of anarchy is O(1) whenever α is O(√n). More generally, 
the price of anarchy is O(1 + α/√n). 


PROOF We again use Lemma 19.4, so all we have to do is improve our bound 
on the diameter d. Consider nodes u and v with dist(u, v) = d. Let 
d′ = ⌊(d − 1)/4⌋ and let B be the set of nodes at most d′ away from u, as shown in 
Figure 19.2. Consider how the distance dist(v, w) changes for nodes w ∈ B by 
adding edge (v, u). Before adding the edge, dist(v, w) ≥ d − d′. After adding 
(v, u), the distance decreases to at most d′ + 1. Thus v saves at least (d − 2d′ − 1) 
in distance to each node in B, and hence would save at least (d − 2d′ − 1)|B| ≥ 
(d − 1)|B|/2 in total distance costs by buying edge (v, u). If G is stable, we must 
have (d − 1)|B|/2 ≤ α. 

For a node w ∈ B let A_w contain all nodes t for which the u-t shortest path 
leaves the set B after the node w. Note that if A_w is nonempty, then w must 
be exactly at distance d′ from u. Therefore, node u would save |A_w|(d′ − 1) 
in distance cost by buying edge (u, w). If the network is at equilibrium, then 
we must have that |A_w|(d′ − 1) ≤ α. There must be a node w ∈ B that has 
|A_w| ≥ (n − |B|)/|B|. Combining these, we get that 

\[ (d' - 1)(n - |B|)/|B| \le \alpha. \]

This implies that |B|(1 + α/(d′ − 1)) ≥ n, and since α > d > d′, 

\[ |B| \ge n(d' - 1)/(2\alpha). \]

Combining this with the previous bound of α ≥ (d − 1)|B|/2 yields 

\[ \alpha \ge (d - 1)|B|/2 \ge (d - 1)\, n\, (d' - 1)/(4\alpha) \ge n(d' - 1)^2/\alpha. \]

Thus α² ≥ n(d′ − 1)² and hence d ≤ 4(d′ + 1) + 1 ≤ 4α/√n + 9, which implies 
the claimed bound by Lemma 19.4. 


19.3 Potential Games and a Global Connection Game 


In this section we introduce a broad class of games known as potential games. This class 
encompasses a number of natural and well-studied network-based games. As we will 
see, potential games possess many nice properties; pure equilibria always exist, best 
response dynamics are guaranteed to converge, and the price of stability can be bounded 
using a technique called the potential function method. Our motivating example for 
this class of games is a network formation game called the global connection game, 
which was discussed in Chapter 17. We begin by defining this game, and present some 
theorems about pure equilibria and the price of stability. We then introduce potential 
games, and provide generalized results for this broader framework. 

The network formation game discussed in Section 19.2 is local in the sense that 
a player can build links to other nodes, but has no direct means for affecting distant 
network structure. Such might be the case with social networks or peering relationships 
in a digital network. The global connection game, in contrast, models players who 
make global structural decisions; players may build edges throughout the network, and 
thus consider relatively complex strategies. This game might be more appropriate for 
modeling the actual construction and maintenance of large-scale physical networks. 

Beyond the varying scope of players’ strategies, there are two additional features 
that differentiate these network formation games. First, in exchange for the global 
connection game’s broader strategy space, we consider a relatively simplified player 
objective function. In particular, we assume that players are unconcerned with their 
distance to other nodes in the network, and instead want only to build a network that 
connects their terminals as cheaply as possible. The second notable distinction is that 
the global connection game supports cooperation, in that multiple players may share 
the cost of building mutually beneficial links. In the local connection game, an edge 
might benefit multiple players, and yet the edge’s cost is always covered fully by one 
of the two incident players. We now give a formal description of the global connection 
game. 


19.3.1 A Global Connection Game 


We are given a directed graph $G = (V, E)$ with nonnegative edge costs $c_e$ for all edges
$e \in E$. There are $k$ players, and each player $i$ has a specified source node $s_i$ and sink
node $t_i$ (the same node may be a source or a sink for multiple players). Player $i$’s goal
is to build a network in which $t_i$ is reachable from $s_i$, while paying as little as possible
to do so. A strategy for player $i$ is a path $P_i$ from $s_i$ to $t_i$ in $G$. By choosing $P_i$, player
$i$ is committing to help build all edges along $P_i$ in the final network. Given a strategy
for each player, we define the constructed network to be $\bigcup_i P_i$.

It remains to allocate the cost of each edge in this network to the players using it, 
as this will allow players to evaluate the utility of each strategy. In principle, there are
a vast number of possible cost-sharing mechanisms, each of which induces a distinct
network formation game. We will briefly touch on this large space of games at the 
end of the section, but for now, our primary focus will be on a single cost-sharing 
mechanism with a number of nice properties, that is both simple and easy to motivate. 

In particular, we consider the mechanism that splits the cost of an edge evenly among
all players whose path contains it. More concretely, if $k_e$ denotes the number of players
whose path contains edge $e$, then $e$ assigns a cost share of $c_e/k_e$ to each player using
$e$. Thus the total cost incurred by player $i$ under a strategy vector $S$ is given by


$$\mathrm{cost}_i(S) = \sum_{e \in P_i} c_e / k_e.$$


Note that the total cost assigned to all players is exactly the cost of the constructed 
network. This equal-division mechanism was suggested by Herzog et al. (1997), and
has a number of basic economic motivations. Moulin and Shenker (2001) prove that this
mechanism can be derived from the Shapley value, and it can be shown to
be the unique cost-sharing scheme satisfying a number of natural sets of axioms (see
Feigenbaum et al., 2001; Moulin and Shenker, 2001). We refer to it as the fair or
Shapley cost-sharing mechanism. The social objective for this game is simply the cost 
of the constructed network. 
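
The equal-division rule is easy to compute directly. The following is a minimal sketch, assuming each path is given as a list of edges and edge costs are integers; the function name `shapley_costs` and the small instance are hypothetical and chosen only for illustration. It also checks the remark above that the players' cost shares add up to the cost of the constructed network.

```python
from collections import Counter
from fractions import Fraction

def shapley_costs(paths, edge_cost):
    """Return each player's cost under equal (Shapley) sharing.

    paths: one path per player, each a list of edges.
    edge_cost: dict mapping each edge e to its nonnegative cost c_e.
    """
    # k_e = number of players whose path contains edge e
    usage = Counter(e for path in paths for e in set(path))
    return [sum(Fraction(edge_cost[e], usage[e]) for e in set(path))
            for path in paths]

# Hypothetical instance: two players share the edge ("a", "t").
edge_cost = {("s1", "a"): 2, ("s2", "a"): 4, ("a", "t"): 6}
paths = [[("s1", "a"), ("a", "t")], [("s2", "a"), ("a", "t")]]

costs = shapley_costs(paths, edge_cost)
network_cost = sum(edge_cost[e] for e in {e for p in paths for e in p})
```

Exact fractions are used so that the budget-balance identity can be compared with `==` rather than up to rounding.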

One may view this game as a competitive version of the generalized Steiner tree 
problem; given a graph and pairs of terminals, find the cheapest possible network 
connecting all terminal pairs. Indeed, an optimal generalized Steiner tree is precisely 
the outcome against which we will compare stable solutions in evaluating the efficiency 
of equilibria. This connection highlights an important difference between this game 
and routing games; in routing games such as those discussed in Chapter 18, players 
are sensitive to congestion effects, and thus seek sparsely used paths. But in the global 
connection game, as with the Steiner forest problem, the objective is simply to minimize 
costs, and thus sharing edges is in fact encouraged. 

The two examples in Chapter 17 provide a few useful observations about this game. 
Example 17.2 (see Figure 19.3(a)) shows that even on very simple networks, this game 
has multiple equilibria, and that these equilibria may differ dramatically in quality. 
There are two equilibria with costs $k$ and 1 respectively. Since the latter is also the optimal
solution, the price of anarchy is $k$, while the price of stability is 1. It is not hard to show
that the price of anarchy can never exceed $k$ on any network (see Exercise 19.9), and
thus this simple example captures the worst-case price of anarchy. Our primary goal
will be to bound the price of stability in general.

Figure 19.3. An instance of the global connection game with price of anarchy $k$ (a) and an
instance with price of stability $H_k$ (b).

Example 17.3 (see Figure 19.3(b)) shows that the price of stability can indeed
exceed 1; this network has a unique Nash equilibrium with cost $H_k$, the $k$th harmonic
number, while the optimal solution has a cost of $1 + \varepsilon$. Thus, the price of stability
on this network is roughly $H_k$. Our aim is to prove that pure equilibria always exist
and provide an upper bound on the price of stability. Both of these results make use of a
potential function, which we will formally introduce in Section 19.3.2.

Consider an instance of the global connection game, and a strategy vector
$S = (P_1, P_2, \ldots, P_k)$ containing an $s_i$-$t_i$ path for each player $i$. For each edge $e$, define a
function $\Psi_e(S)$ mapping strategy vectors to real values as


$$\Psi_e(S) = c_e \cdot H_{k_e},$$


where $k_e$ is the number of players using edge $e$ in $S$, and $H_k = \sum_{j=1}^{k} 1/j$ is the
$k$th harmonic number. Let $\Psi(S) = \sum_e \Psi_e(S)$. While this function does not obviously
capture any important feature of our game, it has the following nice property.


Lemma 19.7 Let $S = (P_1, P_2, \ldots, P_k)$, let $P_i' \neq P_i$ be an alternate path for
some player $i$, and define a new strategy vector $S' = (S_{-i}, P_i')$. Then


$$\Psi(S) - \Psi(S') = u_i(S') - u_i(S).$$


PROOF This lemma states that when a player $i$ changes strategies, the corresponding
change in $\Psi(\cdot)$ exactly mirrors the change in $i$’s utility. Let $k_e$ be the
number of players using $e$ under $S$. For any edge $e$ that appears in both or neither
of $P_i$ and $P_i'$, the cost paid by $i$ toward $e$ is the same under $S$ and $S'$. Likewise,
$\Psi_e(\cdot)$ has the same value under $S$ and $S'$. For an edge $e$ in $P_i$ but not in $P_i'$, by
moving from $S$ to $S'$, $i$ saves (and thus increases her utility by) $c_e/k_e$, which
is precisely the decrease in $\Psi_e(\cdot)$. Similarly, for an edge $e$ in $P_i'$ but not in $P_i$,
player $i$ incurs a cost of $c_e/(k_e + 1)$ in switching from $S$ to $S'$, which matches
the increase in $\Psi_e(\cdot)$. Since $\Psi(\cdot)$ is simply the sum of $\Psi_e(\cdot)$ over all edges, the
collective change in player $i$’s utility is exactly the negation of the change in
$\Psi(\cdot)$.


We also note that $\Psi(S)$ is closely related to cost($S$), the cost of the network generated
by $S$. More precisely, consider any edge $e$ used by $S$. The function $\Psi_e(S)$ is at least $c_e$
(any used edge is selected by at least 1 player), and no more than $H_k c_e$ (there are only
$k$ players). Thus we have


Lemma 19.8 $\mathrm{cost}(S) \le \Psi(S) \le H_k \cdot \mathrm{cost}(S)$.
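
Both lemmas can be checked numerically on a small instance. The sketch below assumes that paths are lists of edge names and that a player's utility is the negative of her cost share; the helper names and the three-player instance are hypothetical. It compares the drop in the harmonic potential with the deviating player's saving (Lemma 19.7) and sandwiches the potential between the network cost and $H_k$ times that cost (Lemma 19.8).

```python
from collections import Counter
from fractions import Fraction

def harmonic(k):
    """H_k = 1 + 1/2 + ... + 1/k as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, k + 1))

def usage(paths):
    """k_e for every edge e used by at least one player."""
    return Counter(e for p in paths for e in set(p))

def player_cost(i, paths, c):
    k = usage(paths)
    return sum(Fraction(c[e], k[e]) for e in set(paths[i]))

def potential(paths, c):
    """Sum over used edges of c_e * H_{k_e}."""
    return sum(c[e] * harmonic(n) for e, n in usage(paths).items())

def network_cost(paths, c):
    return sum(c[e] for e in usage(paths))

# Hypothetical 3-player instance; each "path" is a single edge.
c = {"shared": 3, "own1": 1, "own2": 2}
S = [["shared"], ["shared"], ["own2"]]
S_new = [["shared"], ["own1"], ["own2"]]   # player 1 deviates

delta_potential = potential(S, c) - potential(S_new, c)
delta_cost = player_cost(1, S, c) - player_cost(1, S_new, c)
```

Here player 1 lowers her cost from 3/2 to 1, and the potential falls by the same 1/2.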


These two lemmas are used to prove the following two theorems, which will follow 
from Theorems 19.11, 19.12, and 19.13.


Theorem 19.9 Any instance of the global connection game has a pure Nash
equilibrium, and best response dynamics always converge.


Theorem 19.10 The price of stability in the global connection game with $k$
players is at most $H_k$, the $k$th harmonic number.


Since the proofs of these two results actually apply to a much broader class of games 
(i.e., potential games), we now introduce these games and prove the corresponding 
results in this more general context. 


19.3.2 Potential Games and Congestion Games 


For any finite game, an exact potential function $\Phi$ is a function that maps every
strategy vector $S$ to some real value and satisfies the following condition: If
$S = (S_1, S_2, \ldots, S_k)$, $S_i' \neq S_i$ is an alternate strategy for some player $i$, and $S' = (S_{-i}, S_i')$,
then $\Phi(S) - \Phi(S') = u_i(S') - u_i(S)$. In other words, if the current game state is $S$, and
player $i$ switches from strategy $S_i$ to strategy $S_i'$, then the resulting savings $i$ incurs
exactly matches the decrease in the value of the potential function. Thus Lemma 19.7
simply states that $\Psi$ is an exact potential function for the global connection game.

It is not hard to see that a game has at most one potential function, modulo addition 
by a constant. A game that does possess an exact potential function is called an exact 
potential game. For the remainder of this chapter, we will drop the word “exact” 
from these terms (see Exercise 19.13 for an inexact notion of a potential function). A 
surprising number of interesting games turn out to be potential games, and this structure 
has a number of strong implications for the existence of and convergence to equilibria. 


Theorem 19.11 Every potential game has at least one pure Nash equilibrium,
namely the strategy $S$ that minimizes $\Phi(S)$.


PROOF Let $\Phi$ be a potential function for this game, and let $S$ be a pure strategy
vector minimizing $\Phi(S)$. Consider any move by a player $i$ that results in a new
strategy vector $S'$. By assumption, $\Phi(S') \ge \Phi(S)$, and by the definition of a
potential function, $u_i(S') - u_i(S) = \Phi(S) - \Phi(S')$. Thus $i$’s utility cannot increase
from this move, and hence $S$ is stable.


Going one step further, note that any state $S$ with the property that $\Phi$ cannot be
decreased by altering any one strategy in $S$ is a Nash equilibrium by the same argument.
Furthermore, best response dynamics simulate local search on $\Phi$; improving moves
for players decrease the value of the potential function. Together, these observations
imply the following result.


Theorem 19.12 In any finite potential game, best response dynamics always
converge to a Nash equilibrium.
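
To make the local-search view concrete, here is a minimal sketch of best response dynamics for the global connection game with fair cost sharing. It assumes each player's allowed paths are listed explicitly as frozensets of edges; the function names and the three-player instance are hypothetical. The returned trace records the potential after every move, so one can observe that it strictly decreases until no player can improve.

```python
from collections import Counter
from fractions import Fraction

def harmonic(k):
    return sum(Fraction(1, j) for j in range(1, k + 1))

def cost(i, profile, c):
    """Fair-share cost of player i under the given profile."""
    k = Counter(e for p in profile for e in p)
    return sum(Fraction(c[e], k[e]) for e in profile[i])

def potential(profile, c):
    k = Counter(e for p in profile for e in p)
    return sum(c[e] * harmonic(n) for e, n in k.items())

def best_response_dynamics(strategies, profile, c):
    """Let players switch to a best response until nobody can improve.

    strategies[i] lists player i's allowed paths (frozensets of edges).
    Returns the final profile and the potential after every move.
    """
    profile = list(profile)
    trace = [potential(profile, c)]
    improved = True
    while improved:
        improved = False
        for i, options in enumerate(strategies):
            def cost_if(p):
                return cost(i, profile[:i] + [p] + profile[i + 1:], c)
            best = min(options, key=cost_if)
            if cost_if(best) < cost(i, profile, c):   # strict improvement only
                profile[i] = best
                trace.append(potential(profile, c))
                improved = True
    return profile, trace

# Hypothetical instance: a shared edge of cost 3 or a private edge of cost 2.
c = {"shared": 3, "own0": 2, "own1": 2, "own2": 2}
strategies = [[frozenset({"shared"}), frozenset({f"own{i}"})] for i in range(3)]
start = [frozenset({"shared"}), frozenset({"shared"}), frozenset({"own2"})]
final, trace = best_response_dynamics(strategies, start, c)
```

Starting instead from the profile in which every player uses her private edge, no player can improve alone, so the dynamics stop immediately at an equilibrium of cost 6 rather than 3; convergence says nothing about the quality of the equilibrium reached.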


Note that these two results imply Theorem 19.9.

A less abstract characterization of potential games can be found in a class of games
called congestion games (Rosenthal, 1973). A congestion game has $k$ players and $n$
resources. Player $i$ has a set $\mathcal{S}_i$ of allowable strategies, each of which specifies a subset
of resources. Each resource $j$ has a load-dependent cost function $c_j(x)$, indicating the
cost incurred by any player $i$ whose chosen strategy includes resource $j$ if there are $x$
such players in total. The total cost charged to player $i$ who chooses a strategy $S_i$ is
simply the sum of the costs incurred from each resource in $S_i$. Thus if the total load
on link $j$ is $x_j$, then $i$ pays $\sum_{j \in S_i} c_j(x_j)$. The global connection game is clearly a
congestion game; edges are resources, $s_i$-$t_i$ paths are allowable strategies for player
$i$, and the cost functions are $c_e(x) = c_e/x$.

Rosenthal (1973) proved that any congestion game is a potential game (see Exercise 
19.15). Monderer and Shapley (1996) proved the converse; for any potential game, 
there is a congestion game with the same potential function. 
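
For a general congestion game, the standard potential attributed to Rosenthal sums, for each resource, the costs $c_j(1) + c_j(2) + \cdots + c_j(x_j)$ up to the current load. The following sketch assumes strategies are sets of resource names and cost functions are Python callables; the names and the instance are hypothetical. It checks on one deviation that the drop in this potential equals the deviating player's saving.

```python
from collections import Counter
from fractions import Fraction

def loads(profile):
    """x_j = number of players whose strategy contains resource j."""
    return Counter(r for s in profile for r in s)

def player_cost(i, profile, cost_fn):
    x = loads(profile)
    return sum(cost_fn[r](x[r]) for r in profile[i])

def rosenthal_potential(profile, cost_fn):
    """Sum over resources j of c_j(1) + ... + c_j(x_j)."""
    x = loads(profile)
    return sum(cost_fn[r](n) for r in x for n in range(1, x[r] + 1))

# Hypothetical game: "a" gets costlier with load (a delay), while
# "b" gets cheaper with load (a fixed cost of 6 shared equally).
cost_fn = {"a": lambda x: Fraction(2 * x), "b": lambda x: Fraction(6, x)}
before = [{"a"}, {"a"}, {"b"}]
after = [{"a"}, {"b"}, {"b"}]      # player 1 moves from "a" to "b"

drop_in_potential = (rosenthal_potential(before, cost_fn)
                     - rosenthal_potential(after, cost_fn))
saving = player_cost(1, before, cost_fn) - player_cost(1, after, cost_fn)
```

With $c_e(x) = c_e/x$ the inner sum becomes $c_e H_{x_e}$, which recovers the function $\Psi$ used for the global connection game.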

We now present a generic upper bound on the price of stability for an arbitrary 
potential game. 


19.3.3 The Potential Function Method and the Price of Stability


Suppose that we have a potential game $G$ with a potential function $\Phi(S)$ and social cost
function $c(S)$. If $\Phi(S)$ and $c(S)$ are similar, then the price of stability must be small.
We make this precise in the following theorem.


Theorem 19.13 Suppose that we have a potential game with potential function
$\Phi$, and assume further that for any outcome $S$, we have


$$\frac{\mathrm{cost}(S)}{A} \le \Phi(S) \le B \cdot \mathrm{cost}(S)$$


for some constants $A, B > 0$. Then the price of stability is at most $AB$.


PROOF Let $S^N$ be a strategy vector that minimizes $\Phi(S)$. From Theorem 19.11,
$S^N$ is a Nash equilibrium. It suffices to show that the actual cost of this solution
is not much larger than that of a solution $S^*$ of minimal cost. By assumption,
we have that $\mathrm{cost}(S^N)/A \le \Phi(S^N)$. By the definition of $S^N$, we have that
$\Phi(S^N) \le \Phi(S^*)$. Finally, the second inequality of our assumption implies that
$\Phi(S^*) \le B \cdot \mathrm{cost}(S^*)$. Stringing these inequalities together yields
$\mathrm{cost}(S^N) \le AB \cdot \mathrm{cost}(S^*)$, as desired.


Note that this result, taken together with Lemma 19.8, directly implies Theorem 
19.10. This technique for bounding the price of stability using a potential function is 
known as the potential function method. 
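
It may help to see the whole argument as a single chain. In the display below, $S^N$ denotes a potential-minimizing strategy vector and $S^*$ a minimum-cost one; the superscript $N$ is notation assumed here for illustration. The second line specializes the chain to the global connection game, where Lemma 19.8 gives $A = 1$ and $B = H_k$.

```latex
\begin{align*}
\mathrm{cost}(S^{N})
  &\le A \cdot \Phi(S^{N})
   \le A \cdot \Phi(S^{*})
   \le AB \cdot \mathrm{cost}(S^{*}), \\
\mathrm{cost}(S^{N})
  &\le \Psi(S^{N})
   \le \Psi(S^{*})
   \le H_k \cdot \mathrm{cost}(S^{*})
   \qquad (A = 1,\ B = H_k).
\end{align*}
```

The first inequality in each line uses the lower bound on the potential, the second uses the minimality of $S^N$, and the third uses the upper bound.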

In general, outcomes that minimize the potential function may not be the best Nash
equilibrium, and thus this bound is not always tight (see Exercise 19.14). However,
in the case of the global connection game, we have seen that the price of stability is
at least $H_k$. Thus, for this class of games, the bound given by the potential function
method is the best possible.

Notice that we have essentially already seen the potential function method used in
the nonatomic selfish routing game of Chapter 18. For this routing game, all equilibria
have the same social value, and hence the price of anarchy and the price of stability
are the same. Because of this, Theorem 18.16 is phrased as a statement about the price
of anarchy, but we can still view this result as an application of the potential function
method. In the last section of this chapter, we will see yet another application of this
technique for a potential game that models competitive facility location.

We have seen that potential games have pure equilibria, and that the price of stability 
can be bounded via the potential function method. We now consider the complexity of 
finding these equilibria in general potential games. 


19.3.4 Finding Nash Equilibria in Potential Games 


Theorem 19.12 provides an algorithmic means of reaching pure equilibria in potential
games. Unfortunately, this theorem makes no claim regarding the rate of this convergence.
In some games, best response dynamics always converge quickly, but in
many games they do not. In some games, the potential function $\Phi$ can be minimized in
polynomial time, but in others the minimization problem is NP-hard. To get a better
handle on the complexity of finding pure equilibria in potential games, we consider the
closely related problem of finding local optima in optimization problems.

The class of Polynomial Local Search problems (PLS) was defined by Johnson
et al. (1988) as an abstract class of local optimization problems. First, let us define a
general optimization problem (say a minimization problem) as follows. We have a set
of instances $I$, and for each instance $x \in I$ a set of feasible solutions $F(x)$ and a cost
function $c_x(s)$ defined on all $s \in F(x)$. We also have an oracle (or a polynomial-time
algorithm) that takes an instance $x$ and a candidate solution $s$, and checks whether $s$
is a feasible solution ($s \in F(x)$). If it is, the oracle computes the cost of that solution,
$c_x(s)$. The optimization problem is to find a solution $s \in F(x)$ with minimum cost $c_x(s)$
for a given instance $x \in I$.

To define a local optimization problem, we must also specify a neighborhood
$N_x(s) \subseteq F(x)$ for each instance $x \in I$ and each solution $s \in F(x)$. A solution $s \in F(x)$
is locally optimal if $c_x(s) \le c_x(s')$ for all $s' \in N_x(s)$. The local optimization problem is
to find a local optimum $s \in F(x)$ for a given instance $x \in I$. A local optimization problem
is in PLS if we have an oracle that, for any instance $x \in I$ and solution $s \in F(x)$,
decides whether $s$ is locally optimal, and if not, returns $s' \in N_x(s)$ with $c_x(s') < c_x(s)$.
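
The oracle in this definition suggests an obvious, though possibly very slow, algorithm: keep asking for an improving neighbor until none exists. The sketch below assumes the cost function and the neighborhood are supplied as Python callables; the name `local_search` and the toy instance are hypothetical.

```python
def local_search(start, cost, neighbors):
    """Follow improving moves until a local optimum is reached.

    cost(s) returns the cost of solution s; neighbors(s) yields N(s).
    On a finite solution set the loop terminates because the cost
    strictly decreases, but the number of steps may be exponential.
    """
    s = start
    while True:
        better = next((t for t in neighbors(s) if cost(t) < cost(s)), None)
        if better is None:
            return s            # no neighbor is cheaper: s is locally optimal
        s = better

# Hypothetical instance: solutions are positions 0..5 on a line and the
# neighbors of a position are the adjacent positions.
values = [5, 3, 4, 2, 1, 6]

def line_neighbors(s):
    return [t for t in (s - 1, s + 1) if 0 <= t < len(values)]

from_left = local_search(0, lambda s: values[s], line_neighbors)
from_right = local_search(5, lambda s: values[s], line_neighbors)
```

Starting from the left end the search stops at position 1, a local optimum of cost 3, even though position 4 has cost 1; a PLS problem asks only for some local optimum, not the global one.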

Fabrikant et al. (2004) show that finding a Nash equilibrium in potential games
is PLS-complete, assuming that the best response of each player can be found in
polynomial time. To see that the problem belongs to PLS, we will say that the neighbors
$N_x(s)$ of a strategy vector $s$ are all the strategy vectors $s'$ that can be obtained from
$s$ by a single player changing his or her strategy. By the definition of a potential function
$\Phi$, a strategy vector $s$ is locally optimal for cost function $c_x(s) = \Phi(s)$ if and only if it
is a pure Nash equilibrium, so finding a pure Nash equilibrium is in PLS.

A problem is PLS-complete if it is in PLS and there is a polynomial time reduction 
from all other problems in PLS such that local optima of the target problem correspond 
to local optima of the original one. Since the introduction of this class in Johnson et al. 
(1988), many local search problems have been shown to be PLS-complete, including the
weighted versions of satisfiability (Krentel, 1989). The weighted satisfiability problem
is defined by a formula in conjunctive normal form $C_1 \wedge \cdots \wedge C_n$, with a nonnegative
weight $w_j$ for each clause $C_j$. Solutions $s$ are truth assignments of variables, and the
associated cost $c(s)$ is the sum of the weights of the unsatisfied clauses. The neighbors
of a truth assignment $s$ are the assignments obtained by flipping a single variable in $s$.

Here we show via a reduction from this weighted satisfiability problem that finding
a pure Nash equilibrium in potential games is PLS-complete.


Theorem 19.14 Finding a pure Nash equilibrium in potential games, where
best response can be computed in polynomial time, is PLS-complete.


PROOF We have argued that finding a pure Nash equilibrium in such games is
in PLS. To see that the problem is PLS-complete, we use a reduction from the
weighted satisfiability problem. Consider a weighted satisfiability instance with
$k$ variables $x_1, \ldots, x_k$, and $n$ clauses $C_1, \ldots, C_n$ with weight $w_j$ for clause $C_j$.
Our congestion game will have one player for each variable, and one resource
for each clause. Player $i$, associated with variable $x_i$, has two possible strategies:
it can either select the set of resources $S_i$ consisting of all clauses that contain
the term $x_i$, or $\bar{S}_i$, which includes all clauses containing the term $\bar{x}_i$. Selecting $S_i$
corresponds to setting $x_i$ to false, while selecting $\bar{S}_i$ corresponds to setting $x_i$ to
true.

The main observation is that a clause $C_j$ with $k_j$ literals is false if and only if
the corresponding element has congestion $k_j$. Let $C_j$ be a clause with $k_j$ literals
and weight $w_j$. We define the congestion cost of the element $j$ corresponding
to the clause $C_j$ as $c_j(\xi) = 0$ if $\xi < k_j$ and $c_j(k_j) = w_j$. For the strategy vector
corresponding to the truth assignment $s$, the potential function has value
$\Phi(s) = \sum_j c_j(\xi_j)$, where $\xi_j$ is the number of false literals in $C_j$. The weight of assignment
$s$ is exactly $\Phi(s)$, and thus the equilibria of this game are precisely the local optima
of the satisfiability problem.
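
The reduction is small enough to run by hand or by machine. The following sketch assumes clauses are lists of nonzero integers (`+i` for $x_i$, `-i` for its negation) and that no clause repeats a variable; all function names and the instance are hypothetical. It builds the congestion game from the proof and confirms, over all truth assignments, that the potential equals the total weight of the unsatisfied clauses.

```python
from itertools import product

def build_game(clauses, weights):
    """Congestion game from a weighted CNF instance.

    Returns, for each variable i, the pair (resources chosen when x_i is
    false, resources chosen when x_i is true), plus one cost function per
    clause resource.
    """
    n_vars = max(abs(lit) for cl in clauses for lit in cl)
    strategies = {
        i: ({j for j, cl in enumerate(clauses) if i in cl},    # x_i = False
            {j for j, cl in enumerate(clauses) if -i in cl})   # x_i = True
        for i in range(1, n_vars + 1)
    }
    # Clause j costs w_j only when all k_j of its literals are false.
    cost = [lambda load, k=len(cl), w=w: w if load == k else 0
            for cl, w in zip(clauses, weights)]
    return strategies, cost

def potential(assignment, strategies, cost):
    """Sum of c_j(load_j); equals Rosenthal's potential for these costs."""
    load = [0] * len(cost)
    for i, value in assignment.items():
        for j in strategies[i][1 if value else 0]:
            load[j] += 1
    return sum(c(x) for c, x in zip(cost, load))

def unsatisfied_weight(assignment, clauses, weights):
    def holds(lit):
        return assignment[abs(lit)] == (lit > 0)
    return sum(w for cl, w in zip(clauses, weights)
               if not any(holds(lit) for lit in cl))

# Hypothetical instance: (x1 or x2), (not x1 or x3), (not x2 or not x3).
clauses = [[1, 2], [-1, 3], [-2, -3]]
weights = [4, 2, 1]
strategies, cost = build_game(clauses, weights)
agree = all(
    potential(dict(zip((1, 2, 3), bits)), strategies, cost)
    == unsatisfied_weight(dict(zip((1, 2, 3), bits)), clauses, weights)
    for bits in product([False, True], repeat=3)
)
```

Because a single-variable flip in the formula is a single-player deviation in the game, agreement of the two functions on every assignment is what makes local optima and pure equilibria coincide.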


19.3.5 Variations on Sharing in the Global Connection Game 


We now return to our motivating example, the global connection game. By definition, 
this game requires that the cost of any built edge be shared equally among all players 
using that edge. This sharing rule is natural, arguably fair, and as we have seen, implies 
a number of nice properties. But is this really the best possible sharing rule? Could 
perhaps another sharing rule induce even better outcomes? We can view this question 
as a problem of mechanism design, although here we use the term more broadly than in 
Chapter 9; instead of seeking to elicit “truthful” behavior, we simply want to guarantee 
that stable outcomes exist and are reasonably efficient. 

If we want to design games to induce better outcomes, we must first decide to what 
extent we will allow ourselves, as mechanism designers, to alter the game. After all, 
suppose that we define a game in which players receive a large penalty for taking any 
path that does not conform with a particular optimal solution. Such a game has pure 
equilibria, and the price of anarchy is trivially 1. But intuitively, this is not a satisfying 
solution; this game is too restrictive and fails to capture the decentralized spirit of our
earlier network formation games. Therefore, our first hurdle is to specify the class of
“reasonable” games that are open for consideration. 

To this end, Chen et al. (2006) introduce the class of cost-sharing games. This class
includes the global connection game, as well as similar games with other cost-sharing
rules. A cost-sharing game is played on a graph with edge costs and terminals $s_i$, $t_i$ for
each player $i$. A strategy for player $i$ is an $s_i$-$t_i$ path. Given a strategy vector $S$, cost
shares are assigned to players on an edge-by-edge basis as specified by a cost-sharing
method $\xi_e$ for each edge $e$. In particular, if $S_e$ is the set of players whose path includes
$e$ under $S$, then $\xi_e(i, S_e) \ge 0$ is the cost share assigned to $i$ for $e$. The total cost incurred by
player $i$ is the sum of $i$’s cost shares. We require that any cost-sharing method satisfy
two basic properties:


* Fairness: For all $i$, $e$ we have $\xi_e(i, S_e) = 0$ if $i \notin S_e$.
* Budget-balance: For all $e$ we have $\sum_i \xi_e(i, S_e) = c_e$.


A cost-sharing scheme specifies a cost-sharing method per edge given a network, 
a set of players, and a strategy vector. This definition allows cost-sharing schemes to 
make use of global information, and thus we also consider the special case of oblivious 
cost-sharing schemes, in which cost-sharing methods depend only on $c_e$ and $S_e$. Note
that the Shapley network formation game is an oblivious cost-sharing game, with the
cost-sharing method $\xi_e(i, S_e) = c_e/|S_e|$ for $i \in S_e$.

We now return to our question regarding the relative efficiency of the Shapley 
scheme. In particular, we will show that nonoblivious cost-sharing schemes can provide 
far better guarantees than the Shapley scheme. 


Theorem 19.15 For any undirected network in which all players seek to reach 
a common sink, there is a nonoblivious cost-sharing scheme for which the price 
of anarchy is at most 2. 


PROOF We define a nonoblivious cost-sharing scheme for which players at equilibrium
may be viewed as having simulated Prim’s MST heuristic for approximating
a min cost Steiner tree. Since this heuristic is a 2-approximation algorithm,
such a scheme suffices. More concretely, if $t$ is the common sink, we order players
as follows. Let player 1 be a player whose source $s_1$ is closest to $t$, let player 2
be a player whose source $s_2$ is closest to $\{t, s_1\}$, and so on. Define a cost-sharing
method that assigns the full cost of $e$ to the player in $S_e$ with the smallest index.
Since player 1 pays fully for her path regardless of the other players’ choices, at
equilibrium player 1 must choose a shortest path from $s_1$ to $t$, and inductively, the
remaining players effectively simulate Prim’s algorithm as well.
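
The ordering and the resulting cost-sharing method can be sketched in a few lines. The code below assumes that shortest-path distances between terminals have already been computed and are available as a nested dictionary `dist`; the function names and the distances are hypothetical. Ties are broken by player index, which the proof leaves unspecified.

```python
def prim_order(sink, sources, dist):
    """Order players as in the proof: repeatedly pick the player whose
    source is closest to the terminals {t, s_1, ..., s_m} ordered so far.

    dist[u][v] is assumed to hold shortest-path distances in the network.
    """
    connected = [sink]
    remaining = list(range(len(sources)))
    order = []
    while remaining:
        nxt = min(remaining,
                  key=lambda i: min(dist[sources[i]][v] for v in connected))
        order.append(nxt)
        remaining.remove(nxt)
        connected.append(sources[nxt])
    return order

def cost_share(player, users, edge_cost, rank):
    """Charge the whole edge to the user that comes first in the ordering."""
    payer = min(users, key=lambda i: rank[i])
    return edge_cost if player == payer else 0

# Hypothetical distances between a sink "t" and three sources.
pairs = {("t", "x"): 5, ("t", "y"): 2, ("t", "z"): 4,
         ("x", "y"): 1, ("x", "z"): 6, ("y", "z"): 3}
dist = {}
for (u, v), d in pairs.items():
    dist.setdefault(u, {})[v] = d
    dist.setdefault(v, {})[u] = d

order = prim_order("t", ["x", "y", "z"], dist)
rank = {player: position for position, player in enumerate(order)}
```

The method is fair and budget-balanced in the sense defined above, but it is not oblivious: the payer depends on the global ordering, not only on $c_e$ and $S_e$.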


On the other hand, if we restrict our attention to oblivious schemes, Chen, Roughgarden,
and Valiant prove that for general networks, we cannot do better than the
Shapley cost-sharing scheme in the worst case. More precisely, they argue that any
oblivious cost-sharing scheme either fails to guarantee the existence of pure equilibria
or has a price of stability that is at least $H_k$ for some game. Thus we have an answer
to our original question; while there may be nonoblivious schemes that perform better
than Shapley cost-sharing, no oblivious scheme offers a smaller price of stability in
the worst case. See the notes on this chapter (Section 19.5.2) for a brief discussion of
research concerning other cost-sharing approaches.


19.4 Facility Location 


In the models we have considered so far, players construct networks so as to achieve 
certain connectivity-based goals. Intuitively, these goals are meant to capture players’ 
desires to provide service for some implicit population of network users. Given this per- 
spective, we might then ask what happens when we instead view players as financially 
motivated agents; after all, service providers are primarily concerned with maximizing 
profits, and only maintain networks for this purpose. This suggests a model in which 
players not only build networks but also charge for usage, while network users spur 
competition by seeking the cheapest service available. 

We will consider here a pricing game introduced by Vetta (2002) that is based on the 
facility location problem. In the facility location problem, we want to locate k facilities, 
such as Web servers or warehouses, so as to serve a set of clients profitably. Our focus 
here will be to understand the effect of selfish pricing on the overall efficiency of the 
networks that players form. 

We first present Vetta’s competitive facility location problem, in which players place 
facilities so as to maximize their own profit. We then show that this facility location 
game is a potential game, and prove that the price of anarchy for an even broader class 
of games is small. 


19.4.1 The Model 


Suppose that we have a set of users that need a service, and k service providers. We 
assume that each service provider i has a set of possible locations A; where he can 
locate his facility. 

Define $A = \bigcup_i A_i$ to be the set of all possible facility locations. For each location
$s_i \in A_i$ there is an associated cost $c_{j s_i}$ for serving customer $j$ from location $s_i$. We
can think of these costs as associated with edges of a bipartite graph that has all users
on one side and all of $A$ on the other, as shown in Figure 19.4. A strategy vector
$S = \{s_1, \ldots, s_k\}$ can be thought of as inducing a subgraph of this graph consisting of
the customers and the selected location nodes (marked in black in Figure 19.4).

Figure 19.4. The bipartite graph of possible locations and clients. Selected facilities are marked
in black.

Our goal is to maximize social welfare, rather than simply minimizing the cost of
the constructed network. We assume that customer $j$ has a value $\pi_j$ for service, and
gathers $\pi_j - p$ benefit by receiving service at a price $p < \pi_j$. Locating a facility $s_i$ is
free, but service provider $i$ must pay $c_{j s_i}$ to serve client $j$ from location $s_i$. Doing
so generates a profit of $p - c_{j s_i}$. If provider $i$ services customer $j$ from location $s_i$,
then this arrangement creates a social value (or surplus) of $\pi_j - c_{j s_i}$, the value $\pi_j$ of
service minus the cost $c_{j s_i}$ at which the service is provided. Note that this social surplus
is independent of the price $p = p_{ij}$ charged; varying $p_{ij}$ simply redistributes welfare
between the customer and the provider. We define the social welfare $V(S)$ to be the
total social value over all providers and customers.

To simplify notation, we assume that $\pi_j \ge c_{j s_i}$ for all $j$, $i$, and $s_i \in A_i$. To see that
this requires no loss of generality, note that decreasing $c_{j s_i}$ to be at most $\pi_j$ does not
change the value of any assignment: when $\pi_j < c_{j s_i}$ customer $j$ cannot be served from
location $s_i$, while $\pi_j = c_{j s_i}$ allows us to serve customer $j$ from location $s_i$ at cost. In
either case, the assignment of serving client $j$ from facility $s_i$ results in 0 social value.

To complete the game, we must specify how prices are set and assignments are
determined. Given a strategy vector $s$, we assume that each customer is assigned to
a facility that can serve for the lowest cost. The price $p_{ij}$ charged to a customer $j$
using player $i$’s facility $s_i$ is the cost of the second cheapest connection available to
$j$, i.e., $\min_{i' \neq i} c_{j s_{i'}}$. Intuitively, this is the highest price $i$ could expect to get away with
charging $j$; charging any more would give some player $i'$ an incentive to undercut $i$.
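
The assignment and pricing rule can be sketched as follows, assuming at least two providers and service costs stored as one dictionary per customer; the function name `outcome` and the numbers are hypothetical. Each customer goes to the cheapest provider and pays the cost of the second cheapest one.

```python
def outcome(locations, cost):
    """Assignment, prices, and profits for one chosen location per provider.

    locations[i] is provider i's location; cost[j][a] is the cost of
    serving customer j from location a. Assumes at least two providers.
    """
    profit = [0] * len(locations)
    assignment, price = [], []
    for row in cost:
        ranked = sorted(range(len(locations)), key=lambda i: row[locations[i]])
        winner, runner_up = ranked[0], ranked[1]
        assignment.append(winner)
        price.append(row[locations[runner_up]])     # second cheapest cost
        profit[winner] += price[-1] - row[locations[winner]]
    return assignment, price, profit

# Hypothetical costs for two customers and three candidate locations.
cost = [{"a": 1, "b": 3, "c": 2},
        {"a": 4, "b": 2, "c": 5}]
assignment, price, profit = outcome(["a", "b"], cost)
```

In this instance provider 0 (at "a") serves customer 0 at price 3 and provider 1 (at "b") serves customer 1 at price 4, so each earns a profit of 2.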

Indeed, we can construct an equivalent interpretation of this game in which prices are 
selected strategically. Consider a three-stage game where both providers and customers 
are strategic agents. In the first stage, providers select facility locations. In the second 
stage, providers set prices for users. And, in the last stage, users select a provider for 
service, and pay the specified price. 

As we saw in Chapter 1, subgame perfect equilibrium is a natural solution concept for 
multistage games. We will use here a further refinement of this concept, the trembling 
hand perfect equilibrium for extensive form games (see Mas-Colell et al., 1995). 
Assume that with probability $\varepsilon > 0$, each player picks a strategy chosen uniformly at
random, and chooses a best strategy with the remaining $(1 - \varepsilon)$ probability. We use the
notion of subgame perfect equilibrium for this $\varepsilon$-perturbed game. A trembling hand
perfect equilibrium is an equilibrium that can be reached as the limit of equilibria in
the $\varepsilon$-perturbed game as $\varepsilon$ approaches 0. This stronger notion of stability is required
to prevent providers from offering unprofitably low prices and thereby forcing other 
providers to artificially lower their own prices. 


19.4.2 Facility Location as a Potential Game 


We start by proving that the facility location game is a potential game. 


Theorem 19.16 The facility location game is a potential game with social value 
V(s) as the potential function. 


PROOF We need to argue that if a provider $i$ changes her selected location, then
the change in social welfare $V(s)$ is exactly the change in the provider’s welfare.
To show this, we imagine provider $i$ choosing to “drop out of the game” and show
that the change in social welfare $V(s)$ is exactly $i$’s profit.

If provider $i$ “drops out,” each client $j$ that was served by provider $i$ switches
over to his second best choice. Recall that $p_{ij}$ is exactly the cost of this choice.
Thus the client will be served at cost $p_{ij}$ rather than $c_{j s_i}$, so the increase in cost is
$p_{ij} - c_{j s_i}$, exactly the profit provider $i$ gathers from $j$.

To prove the statement about provider $i$ changing her strategy, we can think
of the change in two steps: first the provider leaves the game, and then reenters
with a different strategy. The change in social welfare is the difference between
the profit of provider $i$ in the two strategies.
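
The identity in this proof can be checked on a small instance. The sketch below assumes that every customer's value is at least every service cost, as in the simplifying assumption of Section 19.4.1; the function names and numbers are hypothetical. It compares the change in social welfare with the change in the moving provider's profit.

```python
def welfare(locations, cost, value):
    """V(s): each customer contributes her value minus the cheapest cost."""
    return sum(v - min(row[a] for a in locations)
               for row, v in zip(cost, value))

def profit(i, locations, cost):
    """Provider i earns (second cheapest cost - own cost) on customers she wins."""
    total = 0
    for row in cost:
        mine = row[locations[i]]
        rival = min(row[a] for n, a in enumerate(locations) if n != i)
        total += max(rival - mine, 0)   # zero when a rival is at least as cheap
    return total

# Hypothetical instance: provider 0 moves from "a" to "d"; provider 1 stays at "b".
value = [10, 10]
cost = [{"a": 1, "b": 3, "d": 6},
        {"a": 4, "b": 2, "d": 1}]
before, after = ["a", "b"], ["d", "b"]

welfare_change = welfare(after, cost, value) - welfare(before, cost, value)
profit_change = profit(0, after, cost) - profit(0, before, cost)
```

Here the move lowers provider 0's profit from 2 to 1 and lowers social welfare from 17 to 16, so the two changes coincide.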


Corollary 19.17 There exists a pure strategy equilibrium, and furthermore, all 
efficient outcomes of the facility location game are stable. Thus, the price of 
stability is 1. Finally, best response dynamics converge to an equilibrium, but this 
equilibrium may not be socially optimal. 


Our next goal is to prove that the price of anarchy for this facility location game is 
small. However, it turns out that the proof applies to a much broader class of games, 
which we present now. 


19.4.3 Utility Games 


Vetta (2002) introduced the facility location game as one example of a large class
of games called utility games. In a utility game, each player $i$ has a set of available
strategies $A_i$, which we will think of as locations, and we define $A = \bigcup_i A_i$. A social
welfare function $V(S)$ is defined for all $S \subseteq A$. Observe that welfare is purely a function
of the selected locations, as is the case with the facility location game. In defining the
socially optimum set, we will consider only sets that contain one location from each
strategy set $A_i$. However, various structural properties of the function $V(S)$ will be
assumed for all $S \subseteq A$. For a strategy vector $s$, we continue to use $V(s)$ as before, and
let $\alpha_i(s)$ denote the welfare of player $i$. A game defined in this manner is said to be a
utility game if it satisfies the following three properties.


(i) $V(S)$ is submodular: for any sets $S \subseteq S' \subseteq A$ and any element $s \in A$, we have
$V(S + s) - V(S) \ge V(S' + s) - V(S')$. In the context of the facility location game, this states
that the marginal benefit to social welfare of adding a new facility diminishes as more
facilities are added.

(ii) The total value for the players is less than or equal to the total social value:
$\sum_i \alpha_i(s) \le V(s)$.

(iii) The value for a player is at least his added value for the society:
$\alpha_i(s) \ge V(s) - V(s - s_i)$.


A utility game is basic if property (iii) is satisfied with equality, and monotone if for
all $S \subseteq S' \subseteq A$, $V(S) \le V(S')$.

To view the facility location game as a utility game, we consider only the providers
as players. We note that the social welfare $V(S) = \sum_j (\pi_j - \min_{a \in S} c_{ja})$ is indeed
purely a function of the selected locations.


Theorem 19.18 The facility location problem is a monotone basic utility game. 


PROOF Property (ii) is satisfied essentially by definition, and we used the equal- 
ity of property (ili) property in proving Theorem 19.16. To show property (1), 
notice that adding a new facility decreases the cost of serving some of the clients. 
The magnitude of this decrease can only become smaller if the clients are already 
choosing from a richer set of facilities. Finally, adding a facility cannot cause 
the cost of serving a client to increase, and thus the facility location game is 
monotone. 
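
The two structural claims in this proof can be checked by brute force on a small instance. The sketch below is only an illustration: the client values, service costs, and the convention $V(\emptyset) = 0$ are hypothetical choices, and every service cost is assumed to be at most the client's value so that welfare is never negative.

```python
from itertools import combinations

# Hypothetical toy instance: 3 clients, 4 candidate locations.
# pi[j] is the value of client j; cost[j][a] is the cost of serving j from a.
pi = [10.0, 8.0, 6.0]
cost = [
    [2.0, 7.0, 9.0, 4.0],
    [6.0, 1.0, 5.0, 8.0],
    [5.0, 6.0, 2.0, 3.0],
]
LOCATIONS = range(4)


def welfare(S):
    """V(S) = sum_j (pi_j - min_{a in S} c_ja); V(empty set) = 0 by assumption."""
    if not S:
        return 0.0
    return sum(pi[j] - min(cost[j][a] for a in S) for j in range(len(pi)))


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_monotone():
    """Check V(S) <= V(T) for all S contained in T."""
    return all(
        welfare(S) <= welfare(T) + 1e-9
        for S in subsets(LOCATIONS)
        for T in subsets(LOCATIONS)
        if S <= T
    )


def is_submodular():
    """Check that marginal gains shrink as the base set grows (property (i))."""
    for S in subsets(LOCATIONS):
        for T in subsets(LOCATIONS):
            if not S <= T:
                continue
            for a in LOCATIONS:
                gain_small = welfare(S | {a}) - welfare(S)
                gain_large = welfare(T | {a}) - welfare(T)
                if gain_small < gain_large - 1e-9:
                    return False
    return True


print(is_monotone(), is_submodular())
```

The check enumerates all pairs of subsets, so it is only practical for a handful of locations.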


19.4.4 The Price of Anarchy for Utility Games 


Since the facility location game is a potential game with the social welfare as the 
potential function, the price of stability is 1. In fact, this applies for any basic utility
game (any utility game with $\alpha_i(s) = V(s) - V(s - s_i)$ for all strategy vectors $s$ and
players $i$). Unfortunately, the increased generality of utility games comes at a cost;
these games are not necessarily potential games, and indeed, pure equilibria do not 
always exist. However, we now show that for monotone utility games that do possess 
pure equilibria (such as the facility location game), the price of anarchy is at most 2. 


Theorem 19.19 For all monotone utility games the social welfare of any pure
Nash equilibrium is at least half the maximum possible social welfare. 


PROOF Let $S$ be the set of facilities selected at an equilibrium, and $O$ be the set
of facilities in a socially optimal outcome. We first note that $V(O) \le V(S \cup O)$
by monotonicity. Let $O^i$ denote the strategies selected by the first $i$ players in the
socially optimal solution. That is, $O^0 = \emptyset$, $O^1 = \{o_1\}$, ..., $O^k = O$. Now

$$V(O) - V(S) \le V(S \cup O) - V(S) = \sum_{i=1}^{k} \left[ V(S \cup O^i) - V(S \cup O^{i-1}) \right].$$

By submodularity (property (i))

$$V(S \cup O^i) - V(S \cup O^{i-1}) \le V(S + o_i - s_i) - V(S - s_i)$$

for all $i$. Using property (iii), we can further bound this by $\alpha_i(S + o_i - s_i)$. Since
$S$ is an equilibrium, $\alpha_i(S + o_i - s_i) \le \alpha_i(S)$. Together these yield

$$V(O) - V(S) \le V(O \cup S) - V(S) \le \sum_i \alpha_i(S).$$

Finally, property (ii) implies that $\sum_i \alpha_i(S) \le V(S)$, so $V(O) \le 2V(S)$, and hence
the price of anarchy is at most 2.
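
The bound can be observed on a small basic utility game by enumerating all strategy vectors. The following sketch uses a hypothetical two-provider facility location instance with disjoint strategy sets and the convention $V(\emptyset) = 0$; payoffs are taken to be $\alpha_i(s) = V(s) - V(s - s_i)$, as in a basic utility game. It illustrates the statement on one instance and proves nothing.

```python
from itertools import product

# Hypothetical instance: pi[j] is the value of client j,
# cost[j][a] the cost of serving client j from location a.
pi = [10.0, 8.0, 6.0]
cost = [
    [2.0, 7.0, 9.0, 4.0],
    [6.0, 1.0, 5.0, 8.0],
    [5.0, 6.0, 2.0, 3.0],
]
# Strategy sets A_0 and A_1 of the two providers (assumed disjoint).
STRATEGY_SETS = [(0, 1), (2, 3)]


def welfare(S):
    """V(S) = sum_j (pi_j - min_{a in S} c_ja); V(empty set) = 0 by assumption."""
    if not S:
        return 0.0
    return sum(pi[j] - min(cost[j][a] for a in S) for j in range(len(pi)))


def payoff(i, s):
    """alpha_i(s) = V(s) - V(s - s_i), the payoff in a basic utility game."""
    chosen = set(s)
    return welfare(chosen) - welfare(chosen - {s[i]})


def is_pure_nash(s):
    """True if no player can strictly raise her payoff by a unilateral move."""
    for i, options in enumerate(STRATEGY_SETS):
        for a in options:
            deviation = s[:i] + (a,) + s[i + 1:]
            if payoff(i, deviation) > payoff(i, s) + 1e-9:
                return False
    return True


profiles = list(product(*STRATEGY_SETS))
optimum = max(welfare(set(s)) for s in profiles)
equilibria = [s for s in profiles if is_pure_nash(s)]

for s in equilibria:
    # Theorem 19.19 predicts V(s) >= optimum / 2 at every pure equilibrium.
    print(s, welfare(set(s)), optimum)
```

On this instance both equilibria are far better than the factor 2 guarantee, which is a worst-case bound.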
19.4.5 Bounding Solution Quality without Reaching an Equilibrium


For any monotone basic utility game, one can also bound the quality of the solution 
without assuming that players reach an equilibrium, as was shown in a sequence of 
two papers by Mirrokni and Vetta (2004) and Goemans et al. (2005). 


Theorem 19.20 Consider an arbitrary solution in a monotone basic utility
game. Suppose that at each time step, we select a player at random and make a
best response move for that player. For any constant $\epsilon > 0$ the expected social
value of the solution after $O(n)$ such moves$^1$ is at least $1/2 - \epsilon$ times the maximum
possible social value.


PROOF Let $S$ be a state, and $O$ be a socially optimal strategy vector. We
will prove that the expected increase in social welfare in one step is at least
$\frac{1}{n}(V(O) - 2V(S))$, which implies the claimed bound after $O(n)$ steps.

Let $\beta_i$ be the maximum possible increase in the value for player $i$. Since the game
is basic, a player's gain from a move equals the resulting gain in social welfare. Thus the
expected increase in value is $\frac{1}{n} \sum_i \beta_i$. Selecting strategy $o_i$ is an available move,
so $\beta_i \ge \alpha_i(S - s_i + o_i) - \alpha_i(S)$, and by basicness,
$\beta_i \ge V(S - s_i + o_i) - V(S - s_i) - \alpha_i(S)$.

The rest of the proof mirrors the price of anarchy proof above. We have

$$V(O) - V(S) \le \sum_i \left[ V(S - s_i + o_i) - V(S - s_i) \right]$$

as before. We bound $V(S + o_i - s_i) - V(S - s_i) \le \alpha_i(S) + \beta_i$. Using this with
property (ii) yields

$$V(O) - V(S) \le \sum_i (\alpha_i(S) + \beta_i) \le V(S) + \sum_i \beta_i.$$

Thus $\sum_i \beta_i \ge V(O) - 2V(S)$, and the expected increase in $V(S)$ is at least
$\frac{1}{n}(V(O) - 2V(S))$. The difference $V(O) - 2V(S)$ is expected to decrease by a factor of
$(1 - \frac{2}{n})$ each step. After $n/2$ steps, the difference is expected to decrease by a
factor of $e$, and after $\log(\epsilon^{-1})\, n$ steps shrinks to an $\epsilon$ factor.
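
The last step of the proof can be spelled out as a short recurrence. The derivation below is a sketch under assumptions not stated explicitly in the proof: $n$ is the number of players with $n \ge 2$, $V$ is nonnegative, $S_t$ is the state after $t$ random best response moves, and $D_t = V(O) - 2V(S_t)$.

```latex
% Assumed notation (not from the text): D_t := V(O) - 2 V(S_t), n >= 2, V >= 0.
\begin{align*}
E[\,V(S_{t+1}) - V(S_t) \mid S_t\,] &\ge \tfrac{1}{n}\, D_t
  && \text{(one-step bound from the proof)} \\
E[\,D_{t+1} \mid S_t\,] &= D_t - 2\, E[\,V(S_{t+1}) - V(S_t) \mid S_t\,]
  \le \left(1 - \tfrac{2}{n}\right) D_t \\
E[D_t] &\le \left(1 - \tfrac{2}{n}\right)^{t} D_0
  \le e^{-2t/n} \max(D_0, 0)
  && \text{(since } 1 - x \le e^{-x}\text{)} \\
t = \tfrac{n}{2} \ln \tfrac{1}{\epsilon}
  \;&\Longrightarrow\; E[D_t] \le \epsilon \max(D_0, 0) \le \epsilon\, V(O) \\
  \;&\Longrightarrow\; E[V(S_t)] \ge \tfrac{1 - \epsilon}{2}\, V(O)
  \ge \left(\tfrac{1}{2} - \epsilon\right) V(O).
\end{align*}
```

With $t = n/2$ the factor $e^{-2t/n}$ equals $e^{-1}$, which is the "factor of $e$" mentioned in the proof.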


19.5 Notes 


19.5.1 Local Connection Game 


Network formation games have a long history in the social sciences, starting with the 
work of Myerson (1977, 1991). A standard example of such games can be found in 
Jackson and Wolinsky (1996) (see Jackson (2006) for a more comprehensive survey). 
These network formation games are often used to model the creation of social networks, 
and aim to capture pairwise relations between individuals who may locally form direct 
links to one another. In other contexts, these games might model peering relations 


$^1$ The constant in the $O(\cdot)$ notation depends on $\log \epsilon^{-1}$.

between pairs of Autonomous Systems (Johari et al., 2006; Fabrikant et al., 2003), or
bilateral contracts between systems as with P2P overlay networks (Chun et al., 2004; 
Christin et al., 2004). Most network formation games in the economics literature use 
a bilateral process, in which edges only form between two agents with the consent of 
both parties, unlike the unilateral process of Section 19.2. 

Jackson and Wolinsky (1996) examine the trade-off between efficient and stable 
networks by studying various network formation games and specifying the conditions 
under which some (or all) stable outcomes are socially efficient, as done in Section 
19.2.2. Section 19.2.3 explores how efficient nonoptimal stable outcomes may be. 

Corbo and Parkes (2005) study a bilateral variant of the local connection game. In the
bilateral network formation game, two nodes must each pay the cost $\alpha$ to form a connecting
edge. Thus edges represent bilateral agreements in which players agree to evenly
share the edge cost (which is effectively $2\alpha$). This contrasts with the unilateral edge
formation used in the local connection game. Otherwise, the games are the same; 
players have the same strategy sets, and evaluate the resulting network in the same 
manner. 

Nash equilibria do not appear to be well-suited for modeling bilateral agreements; 
for a graph to be stable, we need only ensure that no node wants to drop edges, 
since a player cannot singlehandedly add an edge. For example, the empty graph is 
always a Nash equilibrium in the bilateral game, and hence the price of anarchy is very 
high. 

Jackson and Wolinsky (1996) suggest using the notion of pairwise stable equilib- 
rium; no user u wants to drop any adjacent edge e = (u, v), and no pair of users u and 
v wants to add the connecting edge (u, v). This stability concept is closely related to 
a variant of Nash equilibrium in which we allow coalitions of two players to deviate 
together (u and v may drop any subset of edges adjacent to them, and possibly add the 
edge (u, v) connecting them, if this is beneficial to both players). This is the solution 
concept used in the stable matching problem (see Chapter 10), where the natural
deviation for a matching that is not stable is by a “blocking pair”: a man and a woman who
prefer each other to their current partners. 

The optimal network structure is the same as in the unilateral game with edge cost
$2\alpha$. The proof of Theorem 19.1 can be modified to show that when $\alpha > 1$, the star is
pairwise stable, and when $\alpha < 1$ the complete graph is pairwise stable. Note that in
both cases, these networks are also efficient, so the price of stability is 1. One can also 
extend the bounds of Lemma 19.4 and Theorems 19.5 and 19.6 to bound the quality of 
a worst pairwise stable equilibrium (see Exercise 19.8). 
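
Pairwise stability is easy to test directly on small graphs. The sketch below assumes the cost model of the local connection game in its bilateral form: a node pays $\alpha$ for each incident edge plus the sum of its hop distances to all other nodes. It also assumes that a pair adds an edge only if both endpoints strictly gain. The function names are illustrative only.

```python
from collections import deque
from itertools import combinations


def distances_from(source, n, edges):
    """Hop distances from source via breadth-first search; unreachable is inf."""
    adjacency = {v: set() for v in range(n)}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adjacency[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return [dist.get(v, float("inf")) for v in range(n)]


def player_cost(u, n, edges, alpha):
    """Bilateral cost of u: alpha per incident edge plus total distance."""
    degree = sum(1 for e in edges if u in e)
    return alpha * degree + sum(distances_from(u, n, edges))


def is_pairwise_stable(n, edges, alpha):
    edges = {frozenset(e) for e in edges}
    # No endpoint strictly gains by dropping one of its own edges.
    for e in edges:
        without = edges - {e}
        for u in e:
            if player_cost(u, n, without, alpha) < player_cost(u, n, edges, alpha):
                return False
    # No non-adjacent pair exists in which both nodes strictly gain by linking.
    for u, v in combinations(range(n), 2):
        e = frozenset((u, v))
        if e in edges:
            continue
        with_edge = edges | {e}
        if all(
            player_cost(x, n, with_edge, alpha) < player_cost(x, n, edges, alpha)
            for x in (u, v)
        ):
            return False
    return True


star = [(0, i) for i in range(1, 5)]
complete = [(u, v) for u in range(4) for v in range(u + 1, 4)]
print(is_pairwise_stable(5, star, 2.0))      # star with alpha > 1
print(is_pairwise_stable(4, complete, 0.5))  # complete graph with alpha < 1
```

Only single-edge deletions and single-edge additions are tested, which matches the pairwise stability notion described above rather than the stronger two-player coalition variant.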

Andelman et al. (2007) consider the effect of coalitions in the unilateral game. Recall
from Chapter 1 that a strong Nash equilibrium is one where no coalition has a joint
deviation that is profitable for all members. Andelman et al. show that when $\alpha \in (1, 2)$,
there is no stable network resisting deviations by coalitions of size 3, and also that
when $\alpha > 2$, all strong Nash equilibria have cost at most twice the optimum, i.e., the
strong price of anarchy is at most 2.

There are many other natural and relevant variations to the discussed network 
formation games. One important aspect of the model suggested by Jackson (2006) and
Jackson and Wolinsky (1996) is that nodes are not required to reach all other nodes
in the network. Instead, node $u$ has a value $w_{uv}$ for connecting to another node $v$, and
this benefit decays exponentially with distance. In this game, pairwise stable equilibria
may not be connected. 

Chun et al. (2004) introduce a variant of the unilateral network formation game 
to model overlay networks. They allow the cost incurred by node $u$ for adding a
directed edge $(u, v)$ to depend upon $v$ and the degree of $v$, thereby modeling some
congestion effects. The authors also extend the notion of distance beyond hop-count,
and consider restricting the set of possible connections available to each player. Using
Nash equilibria as their solution concept, they study the quantitative trade-offs between 
cost, node degree, and path length in an experimental setting. Christin et al. (2004) also 
use these models, and argue that using approximate (rather than exact) equilibria can 
improve the predictive power of the model and accommodate small errors in modeling 
and decision making. 

Johari et al. (2006) introduced a related game for modeling bilateral contracts 
between systems. In this game players form directed edges to carry traffic, and the 
payments along the links are negotiated, in that players can make offers and demands. 
Anshelevich et al. (2006) propose a variant of this model with fixed routing that 
includes both directed links and symmetric peering links, and show that in this model, 
there exists an efficient solution that is approximately stable in some sense. 


Open Problems 


We have given a bound of $O(\sqrt{\alpha})$ for the price of anarchy of the local connection
game, and improved this bound to $O(1)$ for small $\alpha$. Also, Albers et al. (2006) proved
an $O(1)$ bound for the case $\alpha > 12 n \log n$ (see also Exercise 19.7 for the case $\alpha > n^2$).
It is an open problem whether a constant bound holds for all values of $\alpha$.

The local connection game is an extremely simple model of network formation. It 
would be valuable to understand to what extent the price of anarchy bounds apply to 
broader classes of games. Albers et al. (2006) extend the price of anarchy bound to 
games where traffic (which affects distance costs multiplicatively) is not uniform, but 
edge costs remain uniformly $\alpha$ as before. Unfortunately, these bounds depend on the
traffic weights, and are only $O(1)$ when these weights are relatively small ($n^2 w_{\max} < \alpha$).
Is there a natural characterization of all traffic patterns that support a constant price of 
anarchy? And, can the price of anarchy results extend to models where edge costs vary 
over the network? 

As mentioned, Christin et al. (2004) argue that approximate equilibria are better 
models of natural network formation. Can we extend our price of anarchy bounds to 
approximate equilibria? 

So far we have been concerned with the quality of equilibria, and did not consider 
the network formation process. Does “natural” game play of these local connection 
games converge to an equilibrium efficiently? Bala and Goyal (2000) show that in their 
model, game play does converge to an equilibrium in some cases. Is this also true in
a broader class of games? In cases when natural game play does not converge, or only
converges slowly, can one bound the quality of the solution after a “long enough” game 
play, as we have seen in Section 19.4.5? 

The network formation process of Bala and Goyal (2000) is very uniform, and
leads to networks with extremely simple structure (such as a cycle, star or wheel).
Newman (2003) and Jackson and Rogers (2006) introduce a more complex network
formation process based on a random graph generation process that results in graphs
that have a number of real-world network properties, such as the power-law degree
distribution, clustering, etc. Unfortunately, this process is exogenous, and not really
based on personal incentives. One exciting open challenge is to develop an incentive-based
and endogenous model of network formation that generates more heterogeneous
and realistic networks.


19.5.2 Potential Games and a Global Connection Game 


The global connection game is related to a large body of work on cost-sharing (see 
Feigenbaum et al., 2001; Herzog et al., 1997; and the references therein). Much of this 
work is not game-theoretic; the network is typically assumed to be fixed, and the goal 
is to compute cost shares with certain properties. Chapter 15 considers cost sharing in 
a game-theoretic context by assuming the existence of a central authority who must 
compute cost shares for nodes, each of which has a private utility for inclusion in the 
network. Thus, the focus is on developing a cost sharing mechanism that induces nodes 
to reveal their true valuations. 

Our general results for potential games suggest some natural extensions to the global
connection game. For example, if we consider the global connection game played on
undirected networks, then $\Psi(S)$ is still a potential function. Thus we again have that
pure equilibria exist and the price of stability is at most $H_k$. We can also generalize the
global connection game by allowing players to have more than two terminals they wish
to connect. In such a game, players would select trees spanning their terminals rather
than paths. Again, it is easily verified that $\Psi(S)$ is a potential function, so the same
results apply. Furthermore, we assumed that the cost of each edge $c_e$ is independent of
the number of users. Consider the case when the cost $c_e(k_e)$ of the edge $e$ depends on the
number of players ($k_e$) that use the edge $e$. The same analysis also extends to this version,
assuming the function $c_e(k_e)$ is concave, that is, the cost exhibits an “economy of scale”
property; adding a new user is cheaper when a larger population is using the edge.
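
The exact potential property behind these extensions can be checked numerically. The sketch below assumes fair cost sharing (each of the $k_e$ users of edge $e$ pays $c_e / k_e$) and takes the potential to be $\Psi(S) = \sum_e c_e H_{k_e}$, where $H_k$ is the $k$-th harmonic number. The edge names and costs are hypothetical, and a "path" is represented simply as a tuple of edge names, so the same code covers trees or undirected edges.

```python
from collections import Counter
from fractions import Fraction


def harmonic(k):
    """H_k = 1 + 1/2 + ... + 1/k, computed exactly."""
    return sum(Fraction(1, j) for j in range(1, k + 1))


def usage(paths):
    """k_e: the number of players whose edge set contains e."""
    return Counter(e for path in paths for e in set(path))


def player_cost(i, paths, edge_cost):
    """Fair sharing: player i pays c_e / k_e on every edge she uses."""
    k = usage(paths)
    return sum(Fraction(edge_cost[e]) / k[e] for e in set(paths[i]))


def potential(paths, edge_cost):
    """Psi(S) = sum over used edges of c_e * H_{k_e}."""
    k = usage(paths)
    return sum(Fraction(edge_cost[e]) * harmonic(k[e]) for e in k)


# Hypothetical two-player example: player 1 leaves the shared edge "a" for "b".
edge_cost = {"a": 4, "b": 3}
before = [("a",), ("a",)]
after = [("a",), ("b",)]

change_in_potential = potential(after, edge_cost) - potential(before, edge_cost)
change_in_cost = player_cost(1, after, edge_cost) - player_cost(1, before, edge_cost)
print(change_in_potential, change_in_cost)
```

Exact rational arithmetic is used so that the two changes can be compared with equality rather than a tolerance.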

Anshelevich et al. (2003) consider an unrestricted variant of the global connection 
game. In this game, players select not only a path but also cost shares for each edge on 
that path. If the combined shares for an edge cover its cost, that edge is built. Players 
are assumed to be unhappy if their path is not fully built, and otherwise aim to minimize 
their cost shares. This game does not necessarily have pure equilibria, and even when 
it does, even the price of stability may be O(k). However, in the special case of single 
source games (all players seek connection to a common terminal), the price of stability 
is shown to be 1. 

Chen and Roughgarden (2006) study a weighted generalization of the global connection
game, in which each player has a weight, and cost shares are assigned in
proportion to these weights. This turns out not to be a potential game, and further, the 
authors provide an instance in which no pure equilibrium exists. This paper focuses 
on finding outcomes that are both approximate equilibria and close to optimal. An- 
other similar weighted game is presented by Libman and Orda (2001), with a different 
mechanism for distributing costs among users. They do not consider the quality of 
equilibria, but instead study convergence in parallel networks.

Milchtaich (1996) considers a generalization of congestion games in which each
player has her own payoff function. Equilibria are shown to exist in some class of these
games even though a potential function may not exist.


Open Problems 


Recall the network shown in Figure 19.3(b), which shows that the price of stability may
be $H_k$ for the global connection game. Note, however, that if the edges are undirected, then
the price of stability falls to 1. The actual worst-case price of anarchy for undirected 
graphs remains an open question. 

There are a wide variety of cost-sharing schemes, as defined by Chen and 
Roughgarden (2006), that might be relevant either for practical reasons (such as be- 
ing more fair), or because they induce better outcomes for certain specific classes of 
networks. Many such schemes, including weighted fair sharing, do not yield exact 
potential games. For a large number of these cost-sharing games, the price of anarchy, 
the price of stability, and even the existence of pure equilibria remain unresolved. 

More generally, the class of games we consider aims to model situations where users 
are building a global shared network and care about global properties of the network 
they build. Our focus was on requiring connectivity (of a terminal set) and aiming 
to minimize cost. More generally, it would be valuable to understand which type of 
utility measures yield games with good price of stability properties. For example, we 
might consider users who are allowed to leave some terminals unconnected, or who 
care about other properties of the resulting network, such as distances, congestion, etc. 

Potential functions are an important tool in understanding the price of anarchy and 
stability in games. A recent survey of Roughgarden (2006) shows that one can also un- 
derstand the price of anarchy analysis of resource allocation problems (see Chapter 21) 
via the potential function method. Surprisingly, many of the price of anarchy and 
stability results known to date are for potential games (and their weighted variants); 
the routing games of Chapter 18, the facility location game of Section 19.4, and the 
load balancing problems of Chapter 20. In a number of these cases, the analysis of the 
price of anarchy or stability uses alternative techniques to derive stronger bounds than 
could have been obtained using the potential function method (e.g., bounding the price 
of anarchy with multiple equilibria, or analyzing weighted variants of these games). 
However, one wonders if potential functions still play a role here that we do not fully 
understand. 


19.5.3 Facility Location Game 


There is a large body of literature dedicated to understanding the effects of pricing 
in games. Much of this work focuses on establishing the existence of equilibria, and 
considering qualitative properties of equilibria (such as whether improved service leads 
to improved profit, or if selfish pricing leads to socially efficient outcomes). 

Our focus with the facility location game is to understand the effect of selfish 
pricing on the overall efficiency of a network. In many settings, selfish pricing leads 
to a significant reduction in social welfare, and may also yield models with no pure 
equilibria. An example of this issue is the pricing game of Example 8 in Chapter 1.
See also Chapter 22 for a discussion of these issues in the context of communication
networks. 

Our price of anarchy bound requires that social welfare be monotone in the set of 
facilities selected. It is natural to try to extend this game to a scenario in which facilities 
cost money: in addition to paying the service cost $c_{j s_i}$ for servicing a client $j$ from a
facility $s_i$, the provider also pays an installation cost $f(s_i)$ for building at $s_i$.
Unfortunately, there is no constant bound for the price of anarchy for this case. See Exercise
19.17, which observes that when investment costs are large, noncooperative players do 
not always make the right investments, and thus equilibria may be far from optimal. 

Utility games defined in Section 19.4.3 have a wide range of applications, including 
routing (Vetta, 2002) (see Exercise 19.18), and a market sharing game introduced by 
Goemans et al. (2006) in the context of content distribution in ad-hoc networks (see 
Exercise 19.16 for a special case).

In Section 19.4 we bounded the price of anarchy only for pure equilibria. Recall, 
however, that general utility games may not have pure equilibria. Theorem 19.19 
bounding the quality of equilibria also holds for mixed equilibria (Vetta, 2002) and 
thus is applicable in a much broader context. 

Section 19.4.5 showed that in basic utility games, we can bound the quality of 
solutions without reaching an equilibrium. Such bounds would be even more valuable 
for general utility games, as these might not have any pure equilibria. Goemans et al. 
(2005) provide such bounds for a few other games, including some routing games. 
Unfortunately, the quality of a solution in a general utility game can be very low even 
after infinitely long game play, as shown by Mirrokni and Vetta (2004). 


Open Problems 


Many pricing games fail to have Nash equilibria (Example 8 from Chapter 1) and 
others have equilibria with very low social value (high price of anarchy and stability). 
The facility location games give a class of examples where pure Nash equilibria exist, 
and the price of anarchy is small. It would be great to understand which other classes 
of pricing games share these features. 

It will also be extremely important to understand which other classes of games admit 
good-quality bounds after limited game play, as shown in Section 19.4.5 for facility 
location games.


Exercises


19.1 Consider the local connection game from Section 19.2. In Lemma 19.1, we saw
that the star is an optimal solution for $\alpha > 2$, and the complete graph is an optimal
solution for $\alpha < 2$. Prove that if $\alpha \ne 2$, then these are in fact the only optimal
solutions.

19.2 Give a complete characterization of all optimal networks for $\alpha = 2$.

19.3 Show that when $\alpha < 1$ the complete graph is the only equilibrium.


19.4 Show that a sufficiently long path cannot be a Nash equilibrium of the local 
connection game from Section 19.2. 


19.5 Show that any path can be a pairwise stable network for a large enough value of
$\alpha$ in the bilateral network formation game introduced in Section 19.5.1.


19.6 Construct a Nash equilibrium that is not a star for $\alpha > 2$.


19.7 Show that when $\alpha > n^2$ all Nash equilibria of the local connection game are trees
and the price of anarchy is bounded by a constant. 


19.8 Prove that the bounds of Lemma 19.4 and Theorems 19.5 and 19.6 are also valid
for the worst possible quality of a pairwise stable equilibrium of the bilateral version
of the game (where an edge needs to be selected, and paid for by both endpoints
to be included in $G$).


19.9 Prove that in the global connection game, the price of anarchy can never exceed
$k$, the number of players.


19.10 Consider the following weighted generalization of the global connection game.
For each player $i$, we have a weight $w_i > 0$. As before, each player selects a single
path connecting her source and sink. But instead of sharing edge costs equally,
players are now assigned cost shares in proportion to their weight. In particular,
for a strategy vector $S$ and edge $e$, let $S_e$ denote those players whose path contains
$e$, and let $W_e = \sum_{i \in S_e} w_i$ be the total weight of these players. Then player $i$ pays
$c_e w_i / W_e$ for each edge $e \in P_i$. Note that if all players have the same weight, this
is the original game. Show that, in general, this game does not have an exact
potential function.


19.11 In the global network formation game, edge costs reflect fixed building expenses,
and thus each player’s share for an edge $e$ decreases as more players use $e$.
We might also consider a model with the opposite behavior, i.e., a model in
which the cost of using $e$ increases with the number of players. This would be
more appropriate for modeling latency or similar effects that make congestion
undesirable.

Consider a game played on a network $G$ with $k$ players. Player $i$ has a source
$s_i$ and a sink $t_i$. Each edge $e \in G$ also has a nondecreasing latency function $\ell_e(x)$,
indicating the cost incurred by each player on $e$ if there are $x$ of these players. A
strategy for $i$ is a path from $s_i$ to $t_i$, and choosing a path $P_i$ incurs a total cost of

$$\mathrm{cost}(P_i) = \sum_{e \in P_i} \ell_e(k_e),$$

where $k_e$ is the number of players using $e$.

(a) Prove that this game has an exact potential function.

(b) Suppose that we also give each player $i$ an integral weight $w_i \ge 1$.
A strategy for $i$ is a multiset $S_i$ of $w_i$ paths from $s_i$ to $t_i$. Notice that we
do not insist that these paths be disjoint, or even distinct. Costs are now
assigned in a natural way; we first compute the cost that each individual path
would be charged if each corresponded to a distinct player. Then each player
$i$ is charged the sum of the costs of all paths in $S_i$. Prove that if the latency
functions $\ell_e(x)$ are linear for all $e$, then this game has an exact potential
function.

(c) Show that if $\ell_e(x)$ is not linear, then there may not be an exact potential
function.


19.12 One problem with using best response dynamics to find pure equilibria in potential
games such as the global connection game is that the running time may be
exponential. One natural way to deal with this problem is to run best response
dynamics, but to consider only moves that provide a substantial decrease in the
potential function. In particular, for a constant $\epsilon > 0$, we say a best response
move is substantial if it decreases the potential function by at least an $\epsilon/k$ fraction
of its current value. We consider the process of making substantial best response
moves until none are available.

(a) Prove that this process terminates in time that is polynomial in $n$, $k$, and
$\log(\epsilon^{-1})$.

(b) Show that the resulting outcome is not necessarily an approximate equilibrium.
That is, show that there may be players who can decrease their costs by
an arbitrarily large factor.


19.13 Suppose that we have a game $G$ and a function $\Phi(S)$ mapping game states to
reals with the following property: for any strategy vector $S = (S_1, S_2, \ldots, S_k)$
and any alternate strategy $S_i' \ne S_i$ for some player $i$, if $S' = (S_{-i}, S_i')$, then
$\Phi(S) - \Phi(S')$ and $u_i(S') - u_i(S)$ share the same sign. Thus $\Phi(S)$ behaves like
an exact potential function, except instead of tracking a player’s improvement
exactly, it simply tracks the direction of the improvement; when a player makes
an improving move, the potential function decreases. We call such a function an
ordinal potential function, and $G$ an ordinal potential game.

(a) Prove that if $G$ is an ordinal potential game, then best response dynamics
always converge to a Nash equilibrium.

(b) Prove that the converse is also true; if, from any starting configuration, best
response dynamics always converge to an equilibrium, then $G$ is an ordinal
potential game.


19.14 Give an example of the global connection game for which the best Nash equilibrium
does not minimize the potential function $\Psi$.


19.15 Prove that any congestion game is an exact potential game.


19.16 Consider the following location game. We have an unweighted, undirected network
$G$ and $k$ players. Each player selects a node in $G$ as their location. Each
node $v$ has one unit of wealth that it uniformly distributes to all players in $N[v]$,
the closed neighborhood of $v$. If there are no players in $N[v]$, this wealth is lost.
For example, if $v$ has neighbors $u$ and $x$, 2 players locate at $v$, 3 players locate
at $u$, and no one locates at $x$, then $v$ awards 1/5 to each of these 5 players. The
utility of a player is simply the sum of the value awarded to it by all nodes. We
define the social utility of this game as the number of nodes that have at least one
player located in their closed neighborhood.

(a) Prove that the price of anarchy of this game can be arbitrarily close to 2.

(b) Prove that this location game is a valid utility game.


19.17 In Theorem 19.19 we showed that if the facilities cost 0, then the social welfare
of any Nash equilibrium is at least 1/2 of the maximum possible social welfare of
any solution. In this problem, we consider a variant where facilities cost money;
each possible facility $s_i$ has a cost $f(s_i)$, to be paid by a player who locates a
facility at $s_i$.

(a) Is the same bound on the quality of a Nash equilibrium also true for the variant
of this game in which facilities cost money? Prove or give an example where it is
not true.

(b) Let $F$ denote the total facility cost at a Nash equilibrium $S$, i.e., the sum
$\sum_{s_i \in S} f(s_i)$. Show that we can bound the optimum $V(O)$ by $2V(S) + F$.


19.18 We now consider a variant of the selfish routing game of Chapter 18 with $k$ players.
We have a graph $G$ and a delay function that is monotone increasing and
convex for each edge $e \in E$. Player $i$ has a source $s_i$ and a destination $t_i$, and must
select an $s_i$-$t_i$ path $P_i$ on which to route 1 unit of traffic. Player $i$ will tolerate
up to $d_i$ delay. Player $i$ picks a path from $s_i$ to $t_i$ with minimum delay, or no path
at all if this delay exceeds $d_i$.

(a) Show that this game always has a pure (deterministic) Nash equilibrium.

(b) The traditional way to evaluate such routing games is with the sum of all
delays as cost. However, in this version, the cost may be low simply because
few players get routed. Thus we can instead consider the value gathered by
each player; $d_i$ minus the delay incurred if $i$ does route her traffic, and 0
if she doesn’t. By definition, all players routed have nonnegative value. The
total value of a solution is simply the sum of player values. Show that this is
a utility game.

(c) Is this game a monotone utility game? 


CHAPTER 20 


Selfish Load Balancing 


Berthold Vöcking


Abstract 


Suppose that a set of weighted tasks shall be assigned to a set of machines with possibly different 
speeds such that the load is distributed evenly among the machines. In computer science, this problem 
is traditionally treated as an optimization problem. One of the classical objectives is to minimize the 
makespan, i.e., the maximum load over all machines. Here we study a natural game theoretic variant
of this problem: We assume that the tasks are managed by selfish agents, i.e., each task has an agent 
that aims at placing the task on the machine with smallest load. We study the Nash equilibria of this 
game and compare them with optimal solutions with respect to the makespan. The ratio between the 
worst-case makespan in a Nash equilibrium and the optimal makespan is called the price of anarchy. 
In this chapter, we study the price of anarchy for such load balancing games in four different variants, 
and we investigate the complexity of computing equilibria. 


20.1 Introduction 


The problem of load balancing is fundamental to networks and distributed systems. 
Whenever a set of tasks should be executed on a set of resources, one needs to balance 
the load among the resources in order to exploit the available resources efficiently. 
Often also fairness aspects have to be taken into account. Load balancing has been 
studied extensively and in many variants. One of the most fundamental load balancing 
problems is makespan scheduling on uniformly related machines. This problem is 
defined by $m$ machines with speeds $s_1, \ldots, s_m$ and $n$ tasks with weights $w_1, \ldots, w_n$.
Let $[n] = \{1, \ldots, n\}$ denote the set of tasks and $[m] = \{1, \ldots, m\}$ the set of machines.
One seeks an assignment $A : [n] \to [m]$ of the tasks to the machines that is as
balanced as possible. The load of machine $j \in [m]$ under assignment $A$ is defined as

$$\ell_j = \sum_{i \in [n] :\, A(i) = j} \frac{w_i}{s_j}.$$

The makespan is defined to be the maximum load over all machines. The objective is
to minimize the makespan. If all machines have the same speed, then the problem is
known as makespan scheduling on identical machines, in which case we shall assume
$s_1 = \cdots = s_m = 1$.

In computer science, load balancing is traditionally viewed as an algorithmic prob- 
lem. We design and analyze algorithms, either centralized or distributed, that compute 
the mapping A. Suppose, however, there is no global authority that can enforce an 
efficient mapping of the tasks to the machines. For example, in a typical Internet appli- 
cation, tasks might correspond to requests for downloading large files that users send 
to servers. To maximize the quality of service, each of the users aims at contacting a 
server with smallest load. This naturally leads to the following game theoretic setting 
in which we will be able to analyze what happens to the makespan if there is no global 
authority but selfish users aiming at maximizing their individual benefit decide about 
the assignment of tasks to machines. 

This chapter differs from the other chapters in Part III of this book in two important
aspects. First, the objective function considered here, the makespan, is nonutilitarian.
Second, our analysis considers not only pure but also mixed equilibria. By using
the makespan as objective function, our analysis simultaneously captures the aspects of
efficiency and fairness. By considering mixed equilibria, our analysis explicitly takes
into account the effects of uncoordinated random behavior.


20.1.1 Load Balancing Games 


We identify agents and tasks, i.e., task $i \in [n]$ is managed by agent i. Each agent can
place its task on one of the machines. In other words, the set of pure strategies for an
agent is [m]. A combination of pure strategies, one for each task, yields an assignment
$A : [n] \to [m]$. We assume that the cost of agent i under the assignment A corresponds
to the load on machine $A(i)$, i.e., its cost is $\ell_{A(i)}$. The social cost of an assignment is
denoted cost(A) and is defined to be the makespan, i.e., $\mathrm{cost}(A) = \max_{j \in [m]} \ell_j$.

Agents may use mixed strategies, i.e., probability distributions on the set of pure
strategies. Let $p_i^j$ denote the probability that agent i assigns its task to machine j, i.e.,
$p_i^j = \mathbb{P}[A(i) = j]$. A strategy profile $P = (p_i^j)_{i \in [n], j \in [m]}$ specifies the probabilities for
all agents and all machines. Clearly, every strategy profile P induces a random mapping
A. For $i \in [n]$, $j \in [m]$, let $x_i^j$ be a random variable that takes the value 1 if $A(i) = j$
and 0 otherwise. The expected load of machine j under the strategy profile P is
thus

\[ E[\ell_j] = E\Bigl[\sum_{i \in [n]} \frac{w_i x_i^j}{s_j}\Bigr] = \sum_{i \in [n]} \frac{w_i E[x_i^j]}{s_j} = \sum_{i \in [n]} \frac{w_i p_i^j}{s_j}. \]


The social cost of a strategy profile P is defined as the expected makespan, i.e.,

\[ \mathrm{cost}(P) = E[\mathrm{cost}(A)] = E\Bigl[\max_{j \in [m]} \ell_j\Bigr]. \]


We assume that every agent aims at minimizing its expected cost. From the point of view
of agent i, the expected cost on machine j, denoted by $c_i^j$, is $c_i^j = E[\ell_j \mid A(i) = j]$.
For any profile P,

\[ c_i^j = \frac{w_i + \sum_{k \ne i} w_k p_k^j}{s_j} = E[\ell_j] + (1 - p_i^j) \frac{w_i}{s_j}. \tag{20.1} \]


In general, a strategy profile of a game is a Nash equilibrium if there is no incentive 
for any agent to unilaterally change its strategy. For the load balancing game, such a 
profile is characterized by the property that every agent assigns positive probabilities 
only to those machines that minimize its expected cost. This is formalized as follows. 


Proposition 20.1 A strategy profile P is a Nash equilibrium if and only if

\[ \forall i \in [n] : \forall j \in [m] : \; p_i^j > 0 \Rightarrow \forall k \in [m] : c_i^j \le c_i^k. \]
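
The equilibrium condition can be checked mechanically from Equation 20.1. The following Python sketch is one way to do so; the function names `expected_costs` and `is_nash` are illustrative, indices start at 0, a profile is assumed to be given as a list of rows of exact probabilities, and weights and speeds are assumed to be integers.

```python
from fractions import Fraction

def expected_costs(weights, speeds, profile):
    """costs[i][j] = E[load_j] + (1 - p_i^j) * w_i / s_j, following Equation 20.1."""
    n, m = len(weights), len(speeds)
    expected_load = [sum(weights[i] * profile[i][j] for i in range(n)) / Fraction(speeds[j])
                     for j in range(m)]
    return [[expected_load[j] + (1 - profile[i][j]) * Fraction(weights[i], speeds[j])
             for j in range(m)] for i in range(n)]

def is_nash(weights, speeds, profile):
    """Every machine used with positive probability must minimize the agent's expected cost."""
    costs = expected_costs(weights, speeds, profile)
    return all(costs[i][j] == min(costs[i])
               for i in range(len(weights))
               for j in range(len(speeds)) if profile[i][j] > 0)

half = Fraction(1, 2)
weights, speeds = [2, 2, 1, 1], [1, 1]
uniform = [[half, half] for _ in weights]
print(is_nash(weights, speeds, uniform))       # prints True
print(is_nash(weights, speeds, [[1, 0]] * 4))  # prints False
```

The first profile lets every agent choose each of two identical machines with probability 1/2; the second puts all four tasks on the same machine, which any agent would want to leave.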


The existence of a Nash equilibrium in mixed strategies is guaranteed by the theorem 
of Nash, see Chapters 1 and 2. A strategy profile P is called pure if, for each agent, 
there exists only one machine with positive probability. A Nash equilibrium in pure 
strategies is called a pure Nash equilibrium. Applying the proposition above to pure 
profiles and the corresponding assignments yields the following characterization of a 
pure Nash equilibrium. 


Proposition 20.2 An assignment A is a pure Nash equilibrium if and only if
$\forall i \in [n] : \forall k \in [m] : c_i^{A(i)} \le c_i^k$.


In words, an assignment is a pure Nash equilibrium if and only if no agent can 
improve its cost by unilaterally moving its task to another machine. A special property 
of load balancing games is that they always admit pure Nash equilibria. 


Proposition 20.3 Every instance of the load balancing game admits at least 
one pure Nash equilibrium. 


PROOF An assignment A induces a sorted load vector $(\lambda_1, \ldots, \lambda_m)$, where $\lambda_j$
denotes the load on the machine that has the j-th highest load. If an assignment is
not a Nash equilibrium, then there exists an agent i that can perform an improvement
step, i.e., it can decrease its cost by moving its task to another machine.
We show that the sorted load vector obtained after performing an improvement
step is lexicographically smaller than the one preceding it. As there are only finitely
many assignments, a pure Nash equilibrium is reached after a finite number of
improvement steps.

Suppose, given any sorted load vector $(\lambda_1, \ldots, \lambda_m)$, agent i performs an improvement
step and moves its task from machine j to machine k where the indices
are with respect to the positions of the machines in the sorted load vector. Clearly,
k > j. The improvement step decreases the load on machine j and it increases the
load on machine k. However, the increased load on machine k is smaller than $\lambda_j$
as, otherwise, agent i would not decrease its cost. Hence, the number of machines
with load at least $\lambda_j$ decreases. Furthermore, the loads on all other machines
with load at least $\lambda_j$ are left unchanged. Consequently, the improvement step
yields a sorted load vector that is lexicographically smaller than $(\lambda_1, \ldots, \lambda_m)$.

Thus improvement steps naturally lead to a pure Nash equilibrium. This issue is
also discussed for a broader class of games, so-called potential games, in Chapter 19.
Let us remark that this convergence result implies that there even exists a pure Nash
equilibrium that minimizes the makespan. Given any optimal assignment, such an
equilibrium can be found by performing improvement steps until a Nash equilibrium
is reached because improvement steps do not increase the makespan. Thus, for load
balancing games with social cost equal to the makespan, it does not make much sense
to study the ratio between the social cost in a best Nash equilibrium and the optimal
social cost. This ratio is called the “price of stability.” It is studied in Chapters 17-19
in the context of other games. In this chapter, we are mainly interested in the ratio
between the social cost of the worst Nash equilibrium and the optimal social cost, the
so-called “price of anarchy.”
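
The convergence argument behind Proposition 20.3 can be observed in a small simulation. The sketch below is an illustration under assumptions made here: the function name `improvement_dynamics` and the order in which agents are scanned are arbitrary choices, indices start at 0, and weights and speeds are integers. It records the sorted load vector after every improvement step so that the lexicographic decrease can be inspected.

```python
from fractions import Fraction

def improvement_dynamics(weights, speeds, assignment):
    """Perform improvement steps until none exists; return the final assignment
    and the sorted load vectors seen along the way."""
    assignment = list(assignment)
    m = len(speeds)
    loads = [Fraction(0)] * m
    for i, j in enumerate(assignment):
        loads[j] += Fraction(weights[i], speeds[j])
    history = [sorted(loads, reverse=True)]
    improved = True
    while improved:
        improved = False
        for i in range(len(weights)):
            j = assignment[i]
            for k in range(m):
                new_cost = loads[k] + Fraction(weights[i], speeds[k])
                if k != j and new_cost < loads[j]:
                    # Agent i strictly lowers its cost by moving from j to k.
                    loads[j] -= Fraction(weights[i], speeds[j])
                    loads[k] = new_cost
                    assignment[i] = k
                    history.append(sorted(loads, reverse=True))
                    improved = True
                    break
    return assignment, history

final, history = improvement_dynamics([2, 2, 1, 1], [1, 1], [0, 0, 0, 0])
print(final, [[int(x) for x in vec] for vec in history])
```

Python compares lists lexicographically, so `history[t + 1] < history[t]` expresses exactly the decrease used in the proof.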


20.1.2 Example of a Load Balancing Game 


Suppose that there are two identical machines both of which have speed 1 and four tasks, 
two small tasks of weight 1 and two large tasks of weight 2. An optimal assignment 
would map a small and a large task to each of the machines so that the load on both 
machines is 3. This assignment is illustrated in Figure 20.1(a). 

Now consider an assignment A that maps the two large tasks to the first machine and 
the two small tasks to the second machine as illustrated in Figure 20.1(b). This way, the 
first machine has a load of 4 and the second machine has a load of 2. Obviously, a small 
task cannot improve its cost by moving from the second to the first machine. A large 
task cannot improve its cost by moving from the first to the second machine either as its 
cost would remain 4 if it does. Thus assignment A constitutes a pure Nash equilibrium 
with cost(A) = 4. Observe that all assignments that yield a larger makespan than 4 
cannot be a Nash equilibrium as, in this case, one of the machines has a load of at least 
5 and the other has a load of at most 1 so that moving any task from the former to the 
latter would decrease the cost of this task. Thus, for this instance of the load balancing 
game, the social cost of the worst pure Nash equilibrium is 4. 

Figure 20.1. (a) Illustration of the optimal assignment of an instance of the load balancing
game with two large tasks of size 2 and two small tasks of size 1 as described in the example
given in Section 20.1.2. The social cost of this assignment is 3. (b) Illustration of a pure Nash
equilibrium for the same instance. The social cost of this assignment is 4, which is the maximum
among all pure Nash equilibria for this instance.

Clearly, the worst mixed equilibrium cannot be better than the worst pure equilibrium
as the set of mixed equilibria is a superset of the set of pure equilibria, but can it really
be worse? Suppose that each task is assigned to each of the machines with probability
$\frac{1}{2}$. This corresponds to a strategy profile P with $p_i^j = \frac{1}{2}$ for $1 \le i \le 4$, $1 \le j \le 2$. The
expected load on machine j is thus

\[ E[\ell_j] = \sum_{1 \le i \le 4} w_i p_i^j = 2 \cdot 2 \cdot \frac{1}{2} + 2 \cdot 1 \cdot \frac{1}{2} = 3. \]

It is important to notice that the expected cost of a task on a machine is larger than 
the expected load of the machine, unless the task is assigned with probability 1 to this 
machine. For example, if we assume that task 1 is a large task then Equation 20.1 yields

\[ c_1^j = E[\ell_j] + (1 - p_1^j) w_1 = 3 + \frac{1}{2} \cdot 2 = 4, \]

and, if task 3 is a small task, then

\[ c_3^j = E[\ell_j] + (1 - p_3^j) w_3 = 3 + \frac{1}{2} \cdot 1 = 3.5. \]

For symmetry reasons, the expected cost of each task under the considered strategy 
profile P is the same on both machines so that P is a Nash equilibrium. The social cost of 
this Nash equilibrium, cost(P), is defined to be the expected makespan, E[cost(A)], of 
the random assignment A induced by P. The makespan, cost(A), is a random variable. 
This variable can take one of the four values 3, 4, 5, or 6. There are $2^4 = 16$
different assignments of four tasks to two machines. Of these, 4 yield a makespan of 3,
6 yield a makespan of 4, 4 yield a makespan of 5, and 2 yield a makespan of 6.
Consequently, the social cost of the mixed Nash equilibrium is

\[ \mathrm{cost}(P) = E[\mathrm{cost}(A)] = \frac{1}{16} (3 \cdot 4 + 4 \cdot 6 + 5 \cdot 4 + 6 \cdot 2) = 4.25. \]


Thus mixed equilibria can, in fact, be worse than the worst pure equilibrium. 
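
The numbers in this example are small enough to verify by exhaustive enumeration. The following Python sketch does so; the helper names are chosen here for illustration, and machines are indexed 0 and 1.

```python
from fractions import Fraction
from itertools import product

weights = [2, 2, 1, 1]  # two large and two small tasks
m = 2                   # two identical machines of speed 1

def machine_loads(assignment):
    loads = [0] * m
    for w, j in zip(weights, assignment):
        loads[j] += w
    return loads

def makespan(assignment):
    return max(machine_loads(assignment))

def is_pure_nash(assignment):
    """No task can strictly lower its cost by moving to the other machine."""
    loads = machine_loads(assignment)
    return all(loads[k] + w >= loads[j]
               for w, j in zip(weights, assignment)
               for k in range(m) if k != j)

assignments = list(product(range(m), repeat=len(weights)))
optimum = min(makespan(a) for a in assignments)
worst_pure = max(makespan(a) for a in assignments if is_pure_nash(a))
# Under the uniform profile each of the 16 assignments has probability 1/16.
expected = Fraction(sum(makespan(a) for a in assignments), len(assignments))

print(optimum, worst_pure, float(expected))  # prints 3 4 4.25
```

The three printed values are the optimal makespan, the makespan of the worst pure Nash equilibrium, and the expected makespan of the mixed equilibrium in which every task picks each machine with probability 1/2.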


20.1.3 Definition of the Price of Anarchy 


Not surprisingly, the example above shows that uncoordinated, selfish behavior can 
lead to suboptimal assignments. We are interested in the ratio between the social cost 
(makespan) of a worst-case Nash equilibrium, i.e., the Nash equilibrium with highest 
social cost, and the social cost of an optimal assignment. 


Definition 20.4 (Price of anarchy) For $m \in \mathbb{N}$, let $\mathcal{G}(m)$ denote the set of all
instances of load balancing games with m machines. For $G \in \mathcal{G}(m)$, let Nash(G)
denote the set of all strategy profiles that are Nash equilibria for G, and let
opt(G) denote the minimum social cost over all assignments. Then the price of
anarchy is defined by

\[ \mathrm{PoA}(m) = \max_{G \in \mathcal{G}(m)} \max_{P \in \mathrm{Nash}(G)} \frac{\mathrm{cost}(P)}{\mathrm{opt}(G)}. \]


In the following, we study the price of anarchy in load balancing games in four
different variants in which we distinguish, as a first criterion, between games with
identical and uniformly related machines and, as a second criterion, between pure
Nash equilibria and mixed Nash equilibria. Technically, when considering the price of
anarchy for load balancing games with identical machines, we restrict the set $\mathcal{G}(m)$
to instances in which the m machines all have the same speed. When considering the
price of anarchy with respect to pure equilibria, the set Nash(G) refers only to pure
Nash equilibria rather than mixed equilibria; i.e., we take the maximum only among
pure equilibrium assignments rather than among possibly mixed equilibrium strategy
profiles.

The motivation behind studying the price of anarchy is to quantify the increase of 
the social cost due to selfish behavior. With this motivation in mind, does it make 
more sense to consider pure or mixed equilibria? If one wants to study a distributed 
system in which agents repeatedly perform improvement steps until they reach a Nash 
equilibrium, then pure equilibria are the right solution concept. However, there might 
be other means by which agents come to a Nash equilibrium. In particular, if one views 
load balancing games as one-shot games, then mixed equilibria are a very reasonable
solution concept. Moreover, upper bounds on the price of anarchy for mixed equilibria
are more robust than upper bounds for pure equilibria as mixed equilibria are
more general than pure ones. In this chapter, we consider both of these equilibrium 
concepts. Our analysis begins with the study of pure equilibria as they are usually 
easier to handle than mixed equilibria whose analysis requires a bit of probability 
theory. 


20.2 Pure Equilibria for Identical Machines 


Our analysis of equilibria in load balancing games begins with the most basic case, 
namely the study of pure equilibria on identical machines. Our first topic is the price 
of anarchy. As a second topic, we investigate how long it takes until a pure Nash equi- 
librium is reached when the agents repeatedly perform “best response” improvement 
steps. 


20.2.1 The Price of Anarchy 


In case of pure equilibria and identical machines, the analysis of the price of anarchy 
is quite similar to the well-known analysis of the greedy load balancing algorithm that 
assigns the tasks one after the other in arbitrary order giving each task to the least 
loaded machine. Graham (1966) has shown that the approximation factor of the greedy
algorithm is $2 - \frac{1}{m}$. We show that the price of anarchy for pure equilibria is, in fact,
slightly better than the approximation factor of the greedy algorithm.


Theorem 20.5 Consider an instance G of the load balancing game with n tasks
of weight $w_1, \ldots, w_n$ and m identical machines. Let $A : [n] \to [m]$ denote any
Nash equilibrium assignment. Then, it holds that

\[ \mathrm{cost}(A) \le \Bigl(2 - \frac{2}{m+1}\Bigr) \cdot \mathrm{opt}(G). \]

PROOF Let $j^*$ be a machine with the highest load under assignment A, and
let $i^*$ be a task of smallest weight assigned to this machine. Without loss of
generality, there are at least two tasks assigned to machine $j^*$ as, otherwise,
cost(A) = opt(G) so that the upper bound given in the theorem follows trivially.
Thus $w_{i^*} \le \frac{1}{2} \mathrm{cost}(A)$.

Suppose there is a machine $j \in [m] \setminus \{j^*\}$ with load less than $\ell_{j^*} - w_{i^*}$. Then
moving the task $i^*$ from $j^*$ to j would decrease the cost for this task. Hence, as
A is a Nash equilibrium, it holds for every $j \in [m] \setminus \{j^*\}$ that

\[ \ell_j \ge \ell_{j^*} - w_{i^*} \ge \mathrm{cost}(A) - \frac{1}{2} \mathrm{cost}(A) = \frac{1}{2} \mathrm{cost}(A). \]

Now observe that the cost of an optimal assignment cannot be smaller than the
average load over all machines so that

\[ \mathrm{opt}(G) \ge \frac{\sum_{i \in [n]} w_i}{m} = \frac{\sum_{j \in [m]} \ell_j}{m} \ge \frac{\mathrm{cost}(A) + \frac{1}{2} \mathrm{cost}(A) (m - 1)}{m} = \frac{(m + 1) \, \mathrm{cost}(A)}{2m}. \]

As a consequence,

\[ \mathrm{cost}(A) \le \frac{2m}{m + 1} \cdot \mathrm{opt}(G) = \Bigl(2 - \frac{2}{m + 1}\Bigr) \cdot \mathrm{opt}(G). \]


Observe that the example of a game instance with two identical machines given
in Section 20.1.2 has a price of anarchy of $\frac{4}{3} = 2 - \frac{2}{m+1}$, for m = 2. Exercise 20.2
generalizes this example. It shows that, for every $m \in \mathbb{N}$, there exists an instance G
of the load balancing game with m identical machines and 2m tasks that has a Nash
equilibrium assignment $A : [n] \to [m]$ with

\[ \mathrm{cost}(A) = \Bigl(2 - \frac{2}{m+1}\Bigr) \cdot \mathrm{opt}(G). \]


Thus the upper bound on the price of anarchy given in Theorem 20.5 is tight. 
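
For very small instances the bound of Theorem 20.5 can be checked by brute force. The sketch below enumerates all assignments on m identical machines of speed 1. The function name `worst_pure_nash_ratio` is illustrative, and the three-machine instance with weights 3, 3, 2, 2, 1, 1 is a hypothetical example chosen here; it is not claimed to be the instance from Exercise 20.2.

```python
from fractions import Fraction
from itertools import product

def worst_pure_nash_ratio(weights, m):
    """Ratio between the worst pure Nash makespan and the optimal makespan,
    found by enumerating all m**n assignments (feasible for tiny instances only)."""
    optimum, worst_nash = None, None
    for assignment in product(range(m), repeat=len(weights)):
        loads = [0] * m
        for w, j in zip(weights, assignment):
            loads[j] += w
        cost = max(loads)
        optimum = cost if optimum is None else min(optimum, cost)
        stable = all(loads[k] + w >= loads[j]
                     for w, j in zip(weights, assignment)
                     for k in range(m) if k != j)
        if stable:
            worst_nash = cost if worst_nash is None else max(worst_nash, cost)
    return Fraction(worst_nash, optimum)

for weights, m in [([2, 2, 1, 1], 2), ([3, 3, 2, 2, 1, 1], 3)]:
    ratio = worst_pure_nash_ratio(weights, m)
    print(m, ratio, ratio <= 2 - Fraction(2, m + 1))
```

In both instances the ratio meets the bound $2 - \frac{2}{m+1}$ with equality, namely 4/3 for m = 2 and 3/2 for m = 3.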


20.2.2 Convergence Time of Best Responses 


Our analysis of the price of anarchy leaves open the question of how agents may find
or compute a Nash equilibrium efficiently. In the existence proof for pure equilibria
in Proposition 20.3, we have implicitly shown that every sequence of improvement
steps by the agents leads to a Nash equilibrium. However, if players do not converge to
an equilibrium in reasonable time, then it might not matter whether the finally reached
equilibrium is good. This naturally leads to the question of how many improvement
steps are needed to reach a Nash equilibrium. The following result shows that, in case
of identical machines, there is a short sequence of improvement steps that leads from
any given initial assignment to a pure Nash equilibrium. An agent is said to be satisfied
if it cannot reduce its cost by unilaterally moving its task to another machine. The
max-weight best response policy activates the agents one after the other, always activating
an agent with maximum weight among the unsatisfied agents. An activated agent plays
a best response; i.e., the agent moves its task to the machine with minimum load.


Theorem 20.6 Let $A : [n] \to [m]$ denote any assignment of n tasks to m identical
machines. Starting from A, the max-weight best response policy reaches a
pure Nash equilibrium after each agent was activated at most once.


PROOF We claim that, once an agent $i \in [n]$ was activated and played its best
response, it never gets unsatisfied again. This claim immediately implies the
theorem. Our analysis starts with two observations, both of which hold only for
identical machines. First, we observe that an agent is satisfied if and only if its
task is placed on a machine on which the load due to the other tasks is minimal.
Second, we observe that a best response never decreases the minimum load
among the machines. As a consequence, a satisfied agent can get unsatisfied only
for one reason: the load on the machine holding its task increases because another
agent moves its task to the same machine. Suppose that agent k is activated after
agent i, and it moves its task to the machine holding task i. Let $j^*$ denote the
machine on which i is placed and to which k is moved. For $j \in [m]$, let $\ell_j$ denote
the load on machine j at the time immediately after the best response of agent k.
Since the assignment of k to $j^*$ is a best response and as $w_k \le w_i$ because of the
max-weight policy, it follows

\[ \ell_{j^*} \le \ell_j + w_k \le \ell_j + w_i, \]

for all $j \in [m]$. Hence, after the best response of k, agent i remains satisfied
on machine $j^*$ as it cannot reduce its cost by moving from $j^*$ to any other
machine.


Let us remark that the order in which the agents are activated is crucial. For example, 
if one would always activate an agent of minimum weight among the unsatisfied agents, 
then there are instances of load balancing games on identical machines where one needs 
an exponential number of best response steps to reach a pure Nash equilibrium (Even- 
Dar et al., 2003). 
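
The max-weight best response policy of Theorem 20.6 is simple to simulate. The following Python sketch is an illustration under assumptions made here: machines are identical with speed 1, indices start at 0, ties among agents of equal weight and among machines of equal load are broken by lowest index, and the function name is not from the text.

```python
def max_weight_best_response(weights, assignment, m):
    """Activate a heaviest unsatisfied agent until all agents are satisfied.
    Returns the final assignment and the number of activations per agent."""
    assignment = list(assignment)
    loads = [0] * m
    for i, j in enumerate(assignment):
        loads[j] += weights[i]
    activations = [0] * len(weights)

    def unsatisfied(i):
        j = assignment[i]
        return any(loads[k] + weights[i] < loads[j] for k in range(m) if k != j)

    while True:
        candidates = [i for i in range(len(weights)) if unsatisfied(i)]
        if not candidates:
            return assignment, activations
        i = max(candidates, key=lambda a: weights[a])
        # Best response: remove the task, then place it on a least loaded machine.
        loads[assignment[i]] -= weights[i]
        target = min(range(m), key=lambda k: loads[k])
        loads[target] += weights[i]
        assignment[i] = target
        activations[i] += 1

weights = [5, 4, 3, 3, 2, 1]
final, activations = max_weight_best_response(weights, [0] * len(weights), 3)
print(final, activations)
```

Starting from the assignment that places all six tasks on one machine, no entry of `activations` exceeds 1, in line with the theorem.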


20.3 Pure Equilibria for Uniformly Related Machines 


We now switch from identical to uniformly related machines. First, we study the price 
of anarchy. Then we discuss the complexity of computing equilibria. 


20.3.1 The Price of Anarchy 


The analysis in Section 20.2.1 shows that, in case of identical machines, the makespan
of a pure Nash equilibrium is less than twice the optimal makespan. In this section, we
show that the makespan of pure equilibria on uniformly related machines can deviate
from the optimal makespan by more than a constant factor. The price of anarchy is
bounded, however, by a slowly growing function in the number of machines. Our
analysis begins with an upper bound on the price of anarchy followed by the
presentation of a family of game instances that match this upper bound up to a small
constant factor.

Figure 20.2. (a) Illustration of the definition of the lists $L_{c-1}, L_{c-2}, \ldots, L_0$ from the proof of
Theorem 20.7. (b) Illustration of the lists $L_k$ and $L_{k+1}$ and the machine q used in the proof of
Lemma 20.8.


Theorem 20.7 Consider an instance G of the load balancing game with n tasks
of weight $w_1, \ldots, w_n$ and m machines of speed $s_1, \ldots, s_m$. Let $A : [n] \to [m]$
denote any Nash equilibrium assignment. Then, it holds that

\[ \mathrm{cost}(A) = O\Bigl(\frac{\log m}{\log \log m}\Bigr) \cdot \mathrm{opt}(G). \]

PROOF Let $c = \lfloor \mathrm{cost}(A)/\mathrm{opt}(G) \rfloor$. We show $c \le \Gamma^{-1}(m)$, where $\Gamma^{-1}$ denotes
the inverse of the gamma function, an extension of the factorial function with
the property that $\Gamma(k) = (k - 1)!$, for every positive integer k. This yields the
theorem as

\[ \Gamma^{-1}(m) = \Theta\Bigl(\frac{\log m}{\log \log m}\Bigr). \]

Without loss of generality, let us assume $s_1 \ge s_2 \ge \cdots \ge s_m$, and let L =
[1, 2, ..., m] denote the list of machines in nonincreasing order of speed. For
$k \in \{0, \ldots, c - 1\}$, let $L_k$ denote the maximum length prefix of L such that the
load of each machine in $L_k$ is at least $k \cdot \mathrm{opt}(G)$. Figure 20.2(a) illustrates this
definition. We will show the following recurrence on the lengths of these lists.

\[ |L_k| \ge (k + 1) \cdot |L_{k+1}| \quad (0 \le k \le c - 2), \qquad |L_{c-1}| \ge 1. \]

Solving the recurrence yields $|L_0| \ge (c - 1)! = \Gamma(c)$. Now observe that $L_0 = L$
and, hence, $|L_0| = m$. Consequently, $m \ge \Gamma(c)$ so that $c \le \Gamma^{-1}(m)$, which proves
the theorem.
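
For readers who want the step of solving the recurrence spelled out, the following worked derivation unrolls it. It applies the recurrence once for each k = 0, 1, ..., c − 2 and then uses the base case; nothing beyond the two inequalities stated above is assumed.

```latex
\begin{align*}
|L_0| &\ge 1 \cdot |L_1| \\
      &\ge 1 \cdot 2 \cdot |L_2| \\
      &\;\;\vdots \\
      &\ge \Bigl(\prod_{k=0}^{c-2} (k+1)\Bigr) \cdot |L_{c-1}|
       = (c-1)! \cdot |L_{c-1}| \\
      &\ge (c-1)! = \Gamma(c).
\end{align*}
```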

It remains to prove the recurrence. We first prove $|L_{c-1}| \ge 1$. For the purpose
of a contradiction, assume that the list $L_{c-1}$ is empty. Then the load of machine
1 is less than $(c - 1) \cdot \mathrm{opt}(G)$ in the equilibrium assignment A. Let i be a task
placed on a machine j with load at least $c \cdot \mathrm{opt}(G)$. Moving i to machine 1 reduces
the cost of i to strictly less than

\[ (c - 1) \cdot \mathrm{opt}(G) + \frac{w_i}{s_1} \le (c - 1) \cdot \mathrm{opt}(G) + \mathrm{opt}(G) \le c \cdot \mathrm{opt}(G), \]

where the inequality $\frac{w_i}{s_1} \le \mathrm{opt}(G)$ follows from the fact that $s_1$ is the speed of the
fastest machine. Consequently, agent i is able to unilaterally decrease its cost by
moving its task from machine j to machine 1, which contradicts the assumption
that A is a Nash equilibrium. Thus, we have shown that $|L_{c-1}| \ge 1$.

Next, we show $|L_k| \ge (k + 1) \cdot |L_{k+1}|$, for $0 \le k \le c - 2$. Let $A^*$ be an optimal
assignment, i.e., an assignment whose makespan is equal to opt(G). The following
lemma relates the placement of tasks in the equilibrium assignment A to the
placement of tasks in the optimal assignment $A^*$.


Lemma 20.8 Suppose i is a task with $A(i) \in L_{k+1}$. Then $A^*(i) \in L_k$.


PROOF If $L \setminus L_k = \emptyset$ then this claim follows trivially. Let q be the smallest
index in $L \setminus L_k$, i.e., machine q is one of the machines with maximum speed
among the machines in $L \setminus L_k$. By the definition of the list $L_k$, the load of q is
less than $k \cdot \mathrm{opt}(G)$, i.e., $\ell_q < k \cdot \mathrm{opt}(G)$. Figure 20.2(b) illustrates the situation.

By the definition of the lists, $A(i) \in L_{k+1}$ implies $\ell_{A(i)} \ge (k + 1) \cdot \mathrm{opt}(G)$.
For the purpose of a contradiction, assume $w_i \le s_q \cdot \mathrm{opt}(G)$. Then moving task i to
machine q would reduce the cost of i to

\[ \ell_q + \frac{w_i}{s_q} < k \cdot \mathrm{opt}(G) + \mathrm{opt}(G) \le \ell_{A(i)}, \]

which contradicts the assumption that A is a Nash equilibrium. Hence, every
task i with $A(i) \in L_{k+1}$ satisfies $w_i > s_q \cdot \mathrm{opt}(G)$. Now, for the purpose of a
contradiction, suppose $A^*(i) = j$ and $j \in L \setminus L_k$. Then the load on j under $A^*$
would be at least

\[ \frac{w_i}{s_j} > \frac{s_q \cdot \mathrm{opt}(G)}{s_j} \ge \mathrm{opt}(G) \]

because $s_j \le s_q$. However, this contradicts that $A^*$ is an optimal assignment.
Consequently, $A^*(i) \in L_k$.


By the definition of $L_{k+1}$, the sum of the weights that A assigns to a machine
$j \in L_{k+1}$ is at least $(k + 1) \cdot \mathrm{opt}(G) \cdot s_j$. Hence, the total weight assigned to the
machines in $L_{k+1}$ is at least $\sum_{j \in L_{k+1}} (k + 1) \cdot \mathrm{opt}(G) \cdot s_j$. By Lemma 20.8 an
optimal assignment has to assign all this weight to the machines in $L_k$ such that
the load on each of these machines is at most opt(G). As a consequence,

\[ \sum_{j \in L_{k+1}} (k + 1) \cdot \mathrm{opt}(G) \cdot s_j \le \sum_{j \in L_k} \mathrm{opt}(G) \cdot s_j. \]

Dividing by opt(G) and subtracting $\sum_{j \in L_{k+1}} s_j$ from both sides yields

\[ \sum_{j \in L_{k+1}} k \cdot s_j \le \sum_{j \in L_k \setminus L_{k+1}} s_j. \]

Now let $s^*$ denote the speed of the slowest machine in $L_{k+1}$, i.e., $s^* = s_{|L_{k+1}|}$. For
all $j \in L_{k+1}$, $s_j \ge s^*$, and, for all $j \in L_k \setminus L_{k+1}$, $s_j \le s^*$. Hence, we obtain

\[ \sum_{j \in L_{k+1}} k \cdot s^* \le \sum_{j \in L_k \setminus L_{k+1}} s^*, \]

which implies $|L_{k+1}| \cdot k \le |L_k \setminus L_{k+1}| = |L_k| - |L_{k+1}|$. Thus, $|L_k| \ge (k + 1) \cdot |L_{k+1}|$.
This completes the proof of Theorem 20.7.


We now prove a lower bound showing that the upper bound on the price of anarchy 
given in Theorem 20.7 is essentially tight. 


Theorem 20.9 For every $m \in \mathbb{N}$, there exists an instance G of the load balancing
game with m machines and $n \le m$ tasks that has a Nash equilibrium
assignment $A : [n] \to [m]$ with

\[ \mathrm{cost}(A) = \Omega\Bigl(\frac{\log m}{\log \log m}\Bigr) \cdot \mathrm{opt}(G). \]

PROOF Recall the definition of the gamma function from the proof of Theorem 20.7.
We describe a game instance G together with an equilibrium assignment
A satisfying

\[ \mathrm{cost}(A) \ge \frac{1}{2} \cdot \bigl(\Gamma^{-1}(m) - 2 - o(1)\bigr) \cdot \mathrm{opt}(G), \]

which yields the theorem.
Our construction uses q + 1 disjoint groups of machines denoted $G_0, \ldots, G_q$
with $q \approx \Gamma^{-1}(m)$. More precisely, we set

\[ q = \lfloor \Gamma^{-1}(m/3) - 1 \rfloor \ge \Gamma^{-1}(m) - 2 - o(1). \]

For $0 \le k \le q$, group $G_k$ consists of $q!/k!$ machines of speed $2^k$ each of which
is assigned k tasks of weight $2^k$. Let us remark that 0! = 1. The total number of
machines in these groups is thus

\[ \sum_{k=0}^{q} |G_k| = q! \sum_{k=0}^{q} \frac{1}{k!} \le 3 \Gamma(q + 1) \le m \]

because $\sum_{k=0}^{q} \frac{1}{k!} \le 3$ and $3 \Gamma(q + 1) \le m$, which follows directly from the
definition of q. As m might be larger than the number of the machines in the groups,
there might be some machines that do not belong to any of the groups. We assume
that these machines have the same parameters as the machines in group $G_0$; i.e.,
they have speed $2^0 = 1$ and A does not assign any tasks to them.

We need to show that the described assignment is a Nash equilibrium. An agent
with a task on a machine from group $G_k$ has cost k. It can neither reduce its cost
by moving its task to a machine in group $G_j$ with $j \ge k$ as these machines have
at least a load of k, nor can it reduce its cost by moving its task to a machine in
group $G_j$ with j < k as the load on such a machine, after the task moved to this
machine, would be

\[ j + \frac{2^k}{2^j} = j + 2^{k-j} \ge j + (k - j + 1) = k + 1, \]

since $2^t \ge t + 1$, for every $t \ge 1$. Hence, none of the agents can unilaterally
decrease its cost. In other words, A is a Nash equilibrium.

The social cost of the equilibrium assignment A is q. Next we show that
$\mathrm{opt}(G) \le 2$ so that the theorem follows. We construct an assignment with load at
most 2 on every machine. For each $k \in \{1, \ldots, q\}$, the tasks mapped by A to the
machines in group $G_k$ are now assigned to the machines in group $G_{k-1}$. Observe
that the total number of tasks that A maps to the machines in $G_k$ is

\[ k \cdot |G_k| = k \cdot \frac{q!}{k!} = \frac{q!}{(k - 1)!} = |G_{k-1}|. \]

Hence, we can assign the tasks in such a way that each machine in group $G_{k-1}$
receives exactly one of the tasks that A mapped to a machine in group $G_k$. This
task has a weight of $2^k$ and the speed of the machine is $2^{k-1}$. Hence, the load of
each machine in this assignment is at most 2, which completes the proof.
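
The construction from this proof can be built and checked explicitly for a small value of q. The Python sketch below is an illustration only: it takes q as input instead of deriving it from m, omits the optional extra machines of speed 1, indexes machines from 0, and uses function names chosen here.

```python
from fractions import Fraction
from math import factorial

def build_instance(q):
    """Group k has q!/k! machines of speed 2**k, each holding k tasks of weight 2**k."""
    speeds, group, weights, assignment = [], [], [], []
    for k in range(q + 1):
        for _ in range(factorial(q) // factorial(k)):
            machine = len(speeds)
            speeds.append(2 ** k)
            group.append(k)
            weights.extend([2 ** k] * k)
            assignment.extend([machine] * k)
    return speeds, group, weights, assignment

def loads(speeds, weights, assignment):
    result = [Fraction(0)] * len(speeds)
    for w, j in zip(weights, assignment):
        result[j] += Fraction(w, speeds[j])
    return result

def is_pure_nash(speeds, weights, assignment):
    current = loads(speeds, weights, assignment)
    return all(current[k] + Fraction(w, speeds[k]) >= current[j]
               for w, j in zip(weights, assignment)
               for k in range(len(speeds)) if k != j)

def shift_down(group, assignment):
    """Give every task from a group-k machine its own machine in group k-1."""
    free = {}
    for machine, k in enumerate(group):
        free.setdefault(k, []).append(machine)
    return [free[group[j] - 1].pop() for j in assignment]

speeds, group, weights, equilibrium = build_instance(4)
balanced = shift_down(group, equilibrium)
print(is_pure_nash(speeds, weights, equilibrium),
      max(loads(speeds, weights, equilibrium)),
      max(loads(speeds, weights, balanced)))  # prints True 4 2
```

For q = 4 the instance has 65 machines and 64 tasks. The equilibrium has makespan q = 4, while the shifted assignment has makespan 2, which matches the factor q/2 used in the proof.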


20.3.2 Algorithms for Computing Pure Equilibria 


The proof of Proposition 20.3 reveals that, starting from any initial assignment, a pure 
Nash equilibrium is reached after a finite number of improvement steps. Theorem 20.6 
shows that there exists a sequence of improvement steps of length O(n) in case of 
identical machines and this sequence can be computed efficiently. In the case
of uniformly related machines, however, it is not known whether there always exists a
short sequence of improvement steps and whether such a sequence can be efficiently computed
as in the case of identical machines. Nevertheless, the well-known LPT (largest processing
time) scheduling algorithm allows us to efficiently compute a Nash equilibrium.
This algorithm inserts the tasks in a nonincreasing order of weights, assigning each 
task to a machine that minimizes the cost of the task at its insertion time. 


Theorem 20.10 The LPT algorithm computes a pure Nash equilibrium for load 
balancing games on uniformly related machines. 


PROOF Let the tasks be numbered from 1 to n in the order of their insertion.
Let time $t \in \{0, \ldots, n\}$ denote the point of time after the first t tasks have been
inserted. We show by induction that the partial assignment $A : [t] \to [m]$
computed by LPT at time t is a Nash equilibrium. By our induction assumption
the tasks $1, \ldots, t - 1$ are satisfied at time t − 1; i.e., none of these tasks can
improve its cost by a unilateral deviation. When task t is inserted, it might be
mapped to a machine $j^* \in [m]$ that already holds some other tasks. We only have
to show that these tasks do not get unsatisfied because of the increased load on
$j^*$ caused by the assignment of task t. Let i < t be one of the tasks mapped to
machine $j^*$. For $j \in [m]$, let $\ell_j$ denote the load on machine j at time t. Since the
assignment of task t to machine $j^*$ minimizes the cost of agent t and as $w_t \le w_i$,

\[ \ell_{j^*} \le \ell_j + \frac{w_t}{s_j} \le \ell_j + \frac{w_i}{s_j}, \]

for all $j \in [m]$. Hence, also at time t, agent i is satisfied on machine $j^*$ as it
cannot reduce its cost by moving from $j^*$ to another machine.
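
A minimal Python sketch of LPT on uniformly related machines follows, together with a check of the equilibrium property. The function names, the 0-based indices, the tie-breaking by lowest machine index, and the sample instance are assumptions made here for illustration; weights and speeds are assumed to be integers.

```python
from fractions import Fraction

def lpt(weights, speeds):
    """Insert tasks by nonincreasing weight, each on a machine that minimizes
    the task's cost at insertion time."""
    loads = [Fraction(0)] * len(speeds)
    assignment = [None] * len(weights)
    for i in sorted(range(len(weights)), key=lambda t: -weights[t]):
        best = min(range(len(speeds)),
                   key=lambda j: loads[j] + Fraction(weights[i], speeds[j]))
        assignment[i] = best
        loads[best] += Fraction(weights[i], speeds[best])
    return assignment, loads

def is_pure_nash(weights, speeds, assignment, loads):
    """No task can strictly lower its cost by moving to another machine."""
    return all(loads[k] + Fraction(w, speeds[k]) >= loads[j]
               for w, j in zip(weights, assignment)
               for k in range(len(speeds)) if k != j)

weights, speeds = [7, 5, 4, 3, 3, 2, 1], [1, 2, 3]
assignment, final_loads = lpt(weights, speeds)
print(assignment, is_pure_nash(weights, speeds, assignment, final_loads))
```

Sorting dominates the running time, so with a linear scan over the machines for each task this sketch needs O(n log n + nm) arithmetic operations.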


The assignment computed by the LPT algorithm is not only a Nash equilibrium but
it also approximates the optimal makespan within a ratio of at most $\frac{5}{3}$ for uniformly
related machines and $\frac{4}{3} - \frac{1}{3m}$ for identical machines, see Friesen (1987) and Graham
(1966), respectively. As makespan scheduling is NP-hard even on identical machines,
one cannot hope for an efficient algorithm that computes an assignment with optimal
makespan, unless P = NP. However, the polynomial time approximation scheme of
Hochbaum and Shmoys (1988) computes an assignment of tasks to uniformly related
machines minimizing the makespan within a ratio of $(1 + \epsilon)$, for any given $\epsilon > 0$. This
assignment is not necessarily a Nash equilibrium. Feldmann et al. (2003a) present an
efficient algorithm that transforms any given assignment into an equilibrium assignment
without increasing the makespan. This approach is called Nashification. Combining
the polynomial time approximation scheme with the Nashification approach yields a
polynomial time algorithm that computes an equilibrium assignment for scheduling on
uniformly related machines minimizing the makespan within a factor of $(1 + \epsilon)$, for
any given $\epsilon > 0$.


20.4 Mixed Equilibria on Identical Machines 


The example with two identical machines presented in Section 20.1.2 shows that the 
social cost can increase if players make use of randomization. Let us now study this 
effect systematically. We analyze by how much the price of anarchy is increased 
when the set of strategies is extended from pure to mixed strategies. First, we con- 
sider an extreme case of randomization in which every agent randomizes over all 
strategies. 


20.4.1 Fully Mixed Equilibria 


The support of an agent is the set of strategies to which the agent assigns positive
probability. In a fully mixed strategy profile all pure strategies are in the support of
every agent. There is exactly one fully mixed strategy profile for load balancing games
on identical machines that is a Nash equilibrium. In this fully mixed Nash equilibrium every
player assigns its task with probability $\frac{1}{m}$ to each of the machines, i.e., $P = (p_i^j)$ with
$p_i^j = \frac{1}{m}$ for every $i \in [n]$ and $j \in [m]$. The fully mixed Nash equilibrium maximizes
the randomization and, hence, seems to be a good candidate to study the effects of
randomization.

Our analysis begins with a particularly simple class of load balancing games: Sup- 
pose that we have not only identical machines but also identical tasks. That is, we 
assume that there are m machines of speed 1 and n tasks of weight 1. In the unique 
fully mixed Nash equilibrium for such a game, each task is assigned to each machine 
with probability $\frac{1}{m}$. This strategy profile is a Nash equilibrium as the expected cost $c_i^j$
of any task i on any machine j is the same. In particular, Equation 20.1 yields

\[ c_i^j = E[\ell_j] + \Bigl(1 - \frac{1}{m}\Bigr) = \frac{n}{m} + 1 - \frac{1}{m}. \]


This setup corresponds to a well-studied balls-and-bins experiment from probability 
theory in which n balls are assigned independently, uniformly at random to m bins, 
which is also discussed in Chapter 17. How bad is such a fully mixed Nash equilibrium 
in comparison to an optimal assignment that distributes the tasks evenly among the
machines? An optimal assignment minimizes the makespan, and the optimal makespan is
obviously $\lceil n/m \rceil$. The expected makespan of the fully mixed strategy profile corresponds
to the expected maximum occupancy of the corresponding balls-and-bins experiment,
i.e., the expected number of balls in the fullest bin. The following proposition yields a
simple formula for this quantity that is exact up to constant factors for any choice of m 
and n. 


Proposition 20.11 Suppose that $n \ge 1$ balls are placed independently, uniformly
at random into $m \ge 1$ bins. Then the expected maximum occupancy is

\[ \Theta\Bigl(\frac{\ln m}{\ln\bigl(1 + \frac{m}{n} \ln m\bigr)}\Bigr). \]
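
For very small n and m the expected maximum occupancy can be computed exactly by enumeration and set next to the expression inside the Θ. The Python sketch below does this; the function names are illustrative, the enumeration takes time m^n and is meant for tiny parameters only, and the formula is meaningful only up to constant factors, so the two numbers are not expected to coincide.

```python
from fractions import Fraction
from itertools import product
from math import log

def expected_max_occupancy(n, m):
    """Exact expectation over all m**n equally likely placements of n balls into m bins."""
    total = 0
    for placement in product(range(m), repeat=n):
        counts = [0] * m
        for b in placement:
            counts[b] += 1
        total += max(counts)
    return Fraction(total, m ** n)

def occupancy_formula(n, m):
    """The expression inside Theta(.) in Proposition 20.11, for m >= 2."""
    return log(m) / log(1 + (m / n) * log(m))

for n, m in [(2, 2), (4, 2), (5, 5)]:
    print(n, m, float(expected_max_occupancy(n, m)), round(occupancy_formula(n, m), 3))
```

With unit weights and unit speeds, `expected_max_occupancy(n, m)` is exactly the social cost of the fully mixed Nash equilibrium for n tasks and m machines.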


Let us illustrate the formula for the expected maximum occupancy given in the 
proposition with a few examples. If $n \ge m \log m$, then the expected maximum occupancy
is $\Theta(n/m)$ as, in this case, $\ln(1 + \frac{m}{n} \ln m) = \Theta(\frac{m}{n} \ln m)$. If $n \le m^{1-\epsilon}$, for any
fixed $\epsilon > 0$, then the expected maximum occupancy is $\Theta(1)$. Observe that, in both of
these cases, the ratio between the expected makespan for the fully mixed equilibrium
and the makespan of an optimal assignment is $O(1)$. It turns out that this ratio
is maximized when setting $m = n$. In this case, the expected maximum occupancy
is $\Theta(\log m / \log\log m)$ while the optimal makespan is 1. This yields the following
result. 


Theorem 20.12 For every $m \in \mathbb{N}$, there exists an instance $G$ of a load balancing
game with $m$ identical machines and $n = m$ tasks that has a Nash equilibrium
strategy profile $P$ with

\[ \mathrm{cost}(P) = \Omega\left( \frac{\log m}{\log\log m} \right) \cdot \mathrm{opt}(G). \]
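The balls-and-bins quantity behind Proposition 20.11 and Theorem 20.12 can be explored numerically. The following is a minimal sketch, not part of the original analysis: the function names `occupancy_formula` and `simulated_max_occupancy` are illustrative, the formula requires $m \ge 2$, and since the proposition only holds up to constant factors, the two numbers should agree in order of magnitude rather than exactly.

```python
import math
import random


def occupancy_formula(n: int, m: int) -> float:
    """Return ln m / ln(1 + (m/n) ln m), the term inside Theta(.) in Proposition 20.11.

    Assumes m >= 2 so that the denominator is nonzero.
    """
    return math.log(m) / math.log(1.0 + (m / n) * math.log(m))


def simulated_max_occupancy(n: int, m: int, trials: int, seed: int = 0) -> float:
    """Average number of balls in the fullest bin over `trials` independent runs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        bins = [0] * m
        for _ in range(n):
            # Each ball (task) picks a bin (machine) uniformly at random.
            bins[rng.randrange(m)] += 1
        total += max(bins)
    return total / trials


if __name__ == "__main__":
    # Three regimes: n = m, n much smaller than m, n much larger than m log m.
    for n, m in [(1000, 1000), (100, 10000), (20000, 100)]:
        print(n, m, round(occupancy_formula(n, m), 2),
              simulated_max_occupancy(n, m, trials=20))
```

In the regime $n \ge m \log m$ the formula is close to $n/m$, whereas for $n = m$ it grows like $\log m / \log\log m$, which is the gap exploited by Theorem 20.12.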
As the fully mixed Nash equilibrium is the equilibrium that maximizes the randomization,
one could guess that this is also the equilibrium that maximizes the ratio
between the expected makespan and the optimal makespan for load balancing games. 
This guess is known as the so-called fully mixed Nash equilibrium conjecture. This 
conjecture is appealing as it would yield a simple characterization of the worst-case 
Nash equilibrium for load balancing games. Unfortunately, however, the conjecture is 
wrong. With the help of Proposition 20.11, we can easily construct a counterexample. 
Let $m = 2^{2k}$, for some $k \in \mathbb{N}$. This way, $\sqrt{m}$ as well as $\log m$ are integers. Now consider
the following instance of the load balancing game on $m$ identical machines. Suppose
that there are $\sqrt{m}$ large tasks of weight 1, and $(m - \sqrt{m}) \cdot \log m$ small tasks of weight
$1/\log m$. The balls-and-bins analysis above shows that the maximum number of large tasks
that are assigned to the same machine by a fully mixed Nash equilibrium is O(1), and 
the maximum number of small tasks assigned to the same machine is O(log m). Hence, 
the expected makespan of the fully mixed Nash equilibrium is O(1). Now consider 
the following strategy profile: Assign the large tasks uniformly at random to the first 
$\sqrt{m}$ machines (called group A) and the small tasks uniformly at random to the other
machines (called group B). This profile is a Nash equilibrium as Equation 20.1 yields 
that, for a large task, the expected cost on a machine of group A is less than the expected 
cost on a machine of group B and, for a small task, the expected cost on a machine of 
group B is less than the expected cost on a machine of group A. In this equilibrium, 
the expected maximum occupancy among the large tasks is $\Theta(\log m / \log\log m)$, which shows
that there is a mixed Nash equilibrium whose expected makespan is larger than the
expected makespan of the fully mixed Nash equilibrium by a factor of $\Omega(\log m / \log\log m)$.
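The Nash condition claimed for the grouped profile can be checked directly from Equation 20.1. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the logarithm is taken to base 2 so that $\log m = 2k$, and all machines have speed 1.

```python
def expected_cost(expected_load: float, p: float, w: float, s: float = 1.0) -> float:
    """Equation 20.1: c_i^j = E[l_j] + (1 - p_i^j) * w_i / s_j."""
    return expected_load + (1.0 - p) * w / s


def grouped_profile_costs(k: int) -> dict:
    """Expected costs of large and small tasks under the grouped profile for m = 2^(2k)."""
    m = 2 ** (2 * k)
    a = 2 ** k            # sqrt(m): number of large tasks and size of group A
    b = m - a             # size of group B
    log_m = 2 * k         # log_2 m (assumed base 2)
    small_w = 1.0 / log_m

    # Expected loads: every machine in either group carries expected load 1.
    load_a = a * (1.0 / a) * 1.0
    load_b = (b * log_m) * (1.0 / b) * small_w

    return {
        # A large task uses each group-A machine with probability 1/a and group B never.
        "large_on_A": expected_cost(load_a, 1.0 / a, 1.0),
        "large_on_B": expected_cost(load_b, 0.0, 1.0),
        # A small task uses each group-B machine with probability 1/b and group A never.
        "small_on_B": expected_cost(load_b, 1.0 / b, small_w),
        "small_on_A": expected_cost(load_a, 0.0, small_w),
    }


if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, grouped_profile_costs(k))
```

In each case the cost on a task's own group is strictly smaller than on the other group, so no task has an incentive to shift probability across groups.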


20.4.2 Price of Anarchy 


The fully mixed Nash equilibrium is not necessarily the worst-case Nash equilibrium 
for every instance of the load balancing game on identical machines. Nevertheless, the 
following analysis shows that the lower bound on the price of anarchy that we obtained 
from studying this kind of equilibria is tight. 


Theorem 20.13 Consider an instance $G$ of the load balancing game with $n$
tasks of weight $w_1, \ldots, w_n$ and $m$ identical machines. Let $P = (p_i^j)_{i \in [n], j \in [m]}$
denote any Nash equilibrium strategy profile. Then, it holds that

\[ \mathrm{cost}(P) = O\left( \frac{\log m}{\log\log m} \right) \cdot \mathrm{opt}(G). \]


PROOF Without loss of generality, we assume that all machines have speed 1. 
Recall that $\mathrm{cost}(P) = E[\max_{j \in [m]} \ell_j]$, i.e., $\mathrm{cost}(P)$ corresponds to the expected
maximum load over all machines or, in other words, the expected makespan. 
Our analysis starts with proving an upper bound on the maximum expected load 
instead of the expected maximum load. 

We claim that, for every $j \in [m]$, $E[\ell_j] \le \left(2 - \frac{2}{m+1}\right) \mathrm{opt}(G)$. The proof for
this claim follows the course of the analysis for the upper bound on the price of
anarchy for pure equilibria. More specifically, the proof of Theorem 20.5 can be
adapted as follows to mixed equilibria: Instead of considering a smallest weight
task i* placed on a maximum load machine j*, one defines i* to be the smallest 
weight task with positive probability on a machine j* maximizing the expected 
load. Also in all other occurrences one considers the expected load instead of the 
load. 

We conclude that the maximum expected load is less than $2\,\mathrm{opt}(G)$. Next we
show that the expected maximum load deviates at most by a factor of $O(\log m / \log\log m)$
from the maximum expected load. We use a weighted Chernoff bound in order to
show that it is unlikely that there is a machine that deviates by a large factor from
its expectation.


Lemma 20.14 (weighted Chernoff bound) Let $X_1, \ldots, X_N$ be independent
random variables with values in the interval $[0, z]$ for some $z > 0$, and let
$X = \sum_{i=1}^{N} X_i$. Then, for any $t$, it holds that $P[\sum_{i=1}^{N} X_i \ge t] \le (e \cdot E[X]/t)^{t/z}$.


A description of how to derive this and other variants of the Chernoff bound can be
found, e.g., in Mitzenmacher and Upfal (2005). 

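Fix $j \in [m]$. Let $w$ denote the largest weight of any task. Applying the weighted
Chernoff bound shows that, for every $t$,

\[ P[\ell_j \ge t] \le \min\left\{1, \left(\frac{e \cdot E[\ell_j]}{t}\right)^{t/w}\right\} \le \left(\frac{2e\,\mathrm{opt}(G)}{t}\right)^{t/\mathrm{opt}(G)} \]

because $E[\ell_j] \le 2\,\mathrm{opt}(G)$ and $w \le \mathrm{opt}(G)$. Now let $\tau = 2\,\mathrm{opt}(G) \frac{\ln m}{\ln\ln m}$. Then,
for any $x \ge 0$,

\[ P[\ell_j \ge \tau + x] \le \left(\frac{e \ln\ln m}{\ln m}\right)^{2\ln m/\ln\ln m + x/\mathrm{opt}(G)} \le \left(\frac{1}{\sqrt{\ln m}}\right)^{2\ln m/\ln\ln m} \cdot e^{-x/\mathrm{opt}(G)} = m^{-1} \cdot e^{-x/\mathrm{opt}(G)}, \]

where the second inequality holds asymptotically as, for sufficiently large $m$,
$\frac{\ln m}{e \ln\ln m} \ge \sqrt{\ln m}$ and $\frac{\ln m}{e \ln\ln m} \ge e$.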
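To get a feeling for what "sufficiently large $m$" means here, the two conditions can be checked numerically. This is a rough sketch with hypothetical helper names; it works with $y = \ln m$ directly, because the values of $m$ involved are far too large to represent as ordinary numbers, and it scans upward from $y = 2$ in unit steps.

```python
import math


def tail_conditions_hold(ln_m: float) -> bool:
    """Check ln m / (e ln ln m) >= sqrt(ln m) and ln m / (e ln ln m) >= e for y = ln m > 1."""
    ratio = ln_m / (math.e * math.log(ln_m))
    return ratio >= math.sqrt(ln_m) and ratio >= math.e


def smallest_ln_m(start: float = 2.0, step: float = 1.0, limit: float = 10_000.0) -> float:
    """Return the first value y = start, start + step, ... at which both conditions hold."""
    y = start
    while y <= limit:
        if tail_conditions_hold(y):
            return y
        y += step
    raise ValueError("no threshold found below the given limit")


if __name__ == "__main__":
    print("conditions at ln m = 100:", tail_conditions_hold(100.0))
    print("conditions at ln m = 300:", tail_conditions_hold(300.0))
    print("first ln m on the grid where both hold:", smallest_ln_m())
```

The scan indicates that $\ln m$ has to be a little over 200 before both inequalities hold, so the tail bound in this form is a genuinely asymptotic statement.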

Now with the help of the tail bound we can upper-bound $\mathrm{cost}(P)$ as follows. For
every nonnegative random variable $X$, $E[X] = \int_0^\infty P[X \ge t]\,dt$. Consequently,

\[ \mathrm{cost}(P) = E\left[\max_{j \in [m]} \ell_j\right] = \int_0^\infty P\left[\max_{j \in [m]} \ell_j \ge t\right] dt. \]

Substituting $t$ by $\tau + x$ and then applying the union bound yields

\[ \mathrm{cost}(P) \le \tau + \int_0^\infty P\left[\max_{j \in [m]} \ell_j \ge \tau + x\right] dx \le \tau + \int_0^\infty \sum_{j \in [m]} P[\ell_j \ge \tau + x]\,dx. \]

Finally, we apply the tail bound derived above and obtain

\[ \mathrm{cost}(P) \le \tau + \int_0^\infty e^{-x/\mathrm{opt}(G)}\,dx = \tau + \mathrm{opt}(G), \]

which yields the theorem as $\tau = 2\,\mathrm{opt}(G) \frac{\ln m}{\ln\ln m}$.


20.5 Mixed Equilibria on Uniformly Related Machines 


Finally, we come to the most general case, namely mixed equilibria on uniformly related 
machines. The following theorem shows that the price of anarchy for this case is only 
slightly larger than the one for mixed equilibria on identical machines or pure equilibria 
on uniformly related machines. The analysis combines the methods from both of these 
more restricted cases: First, we show that the maximum expected makespan is bounded by

\[ O\left( \frac{\log m}{\log\log m} \right) \cdot \mathrm{opt} \]


using the same kind of arguments as in the analysis of the price of anarchy for pure 
equilibria on uniformly related machines. Then, as in the case of mixed equilibria on 
identical machines, we use a Chernoff bound to show that the expected maximum load 
is not much larger than the maximum expected load. In fact, this last step loses only a
factor of order $\log\log m / \log\log\log m$, which results in an upper bound on the price
of anarchy of

\[ O\left( \frac{\log m}{\log\log\log m} \right). \]


After proving this upper bound, we present a corresponding lower bound by adding 
some randomization to the lower bound construction for pure equilibria on uni- 
formly related machines, which increases also the lower bound by a factor of order 
log log m/ log log log m and, hence, yields a tight result about the price of anarchy. 


Theorem 20.15 Consider an instance $G$ of the load balancing game with $n$
tasks of weight $w_1, \ldots, w_n$ and $m$ machines of speed $s_1, \ldots, s_m$. Let $P$ be any
Nash equilibrium strategy profile. Then, it holds that

\[ \mathrm{cost}(P) = O\left( \frac{\log m}{\log\log\log m} \right) \cdot \mathrm{opt}(G). \]


PROOF As in the case of identical machines, our analysis starts with proving an 
upper bound on the maximum expected load instead of the expected maximum 
load. To simplify the notation, we assume opt(G) = 1, which can be achieved by 
scaling the weights appropriately. Let $c = \lfloor \max_{j \in [m]} E[\ell_j] \rfloor$. We first prove an
upper bound on $c$ following the analysis for pure Nash equilibria in Theorem 20.7.
Without loss of generality, assume $s_1 \ge s_2 \ge \cdots \ge s_m$. Let $L = [1, 2, \ldots, m]$
denote the list of machines in nonincreasing order of speed. For $k \in \{0, \ldots, c-1\}$,
let $L_k$ denote the maximum length prefix of $L$ such that the expected load
of each machine in $L_k$ is at least $k$. Analogously to the analysis in the proof of
Theorem 20.7, one shows the recurrence $|L_k| \ge (k+1) \cdot |L_{k+1}|$, for $0 \le k \le c-2$,
and $|L_{c-1}| \ge 1$. Solving the recurrence yields $|L_0| \ge (c-1)! = \Gamma(c)$. Thus,
$|L_0| = m$ implies $c \le \Gamma^{-1}(m) = \Theta(\ln m/\ln\ln m)$. Now let

\[ C = \max\left\{c + 1, \frac{\ln m}{\ln\ln m}\right\} = \Theta\left(\frac{\ln m}{\ln\ln m}\right). \]
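The bound $c \le \Gamma^{-1}(m)$ obtained from the recurrence can be made concrete for integer values. The sketch below uses the hypothetical name `max_equilibrium_level` and computes the largest integer $c$ with $(c-1)! \le m$, printing it next to $\ln m/\ln\ln m$; since the two quantities agree only up to a constant factor, they are comparable in order of magnitude, not equal.

```python
import math


def max_equilibrium_level(m: int) -> int:
    """Largest integer c >= 1 with (c - 1)! <= m, an integer version of Gamma^{-1}(m)."""
    c = 1
    fact = 1  # invariant: fact == (c - 1)!
    while fact * c <= m:
        # c! <= m, so level c + 1 is still compatible with |L_0| = m.
        fact *= c
        c += 1
    return c


if __name__ == "__main__":
    for m in (120, 10**6, 10**12):
        c = max_equilibrium_level(m)
        print(m, c, round(math.log(m) / math.log(math.log(m)), 2))
```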


In the rest of the proof, we show that the expected makespan of the equilibrium 
assignment can exceed $C$ at most by a factor of order $\ln\ln m/\ln\ln\ln m$ so that
the expected makespan is $O(\ln m/\ln\ln\ln m)$, which proves the theorem as we
assume opt(G) = 1. 

As the next step, we prove a tail bound on $\ell_j$, for any fixed $j \in [m]$ and, afterward,
we use this tail bound to derive an upper bound on the expected makespan.
For a machine $j \in [m]$, let $T_j^{(1)}$ denote the set of tasks $i$ with $p_i^j \ge \frac{1}{4}$ and $T_j^{(2)}$
the set of tasks $i$ with $p_i^j \in (0, \frac{1}{4})$. Let $\ell_j^{(1)}$ and $\ell_j^{(2)}$ denote random variables that
describe the load on machine $j$ only taking into account the tasks in $T_j^{(1)}$ and $T_j^{(2)}$,
respectively. Observe that $\ell_j = \ell_j^{(1)} + \ell_j^{(2)}$. For the tasks in $T_j^{(1)}$, we immediately
obtain

\[ \ell_j^{(1)} \le \sum_{i \in T_j^{(1)}} \frac{w_i}{s_j} \le 4 \sum_{i \in T_j^{(1)}} \frac{w_i\,p_i^j}{s_j} = 4\,E[\ell_j^{(1)}] \le 4C. \tag{20.2} \]

To prove an upper bound on $\ell_j^{(2)}$, we use the weighted Chernoff bound from
Lemma 20.14. This bound requires an upper bound on the maximum weight.
As a first step to bound the weights, we prove a result about the relationship
between the speeds of the machines in the different groups that are defined by
the prefixes. For $0 \le k \le c-2$, let $G_k = L_k \setminus L_{k+1}$, and let $G_{c-1} = L_{c-1}$. For
$0 \le k \le c-1$, let $s(k)$ denote the speed of the fastest machine in $G_k$. Clearly,
$s(c-1) \ge s(c-2) \ge \cdots \ge s(1) \ge s(0)$. We claim that this sequence is, in fact,
geometrically decreasing.


Lemma 20.16 For $0 \le k \le c-4$, $s(k+2) \ge 2\,s(k)$.


PROOF To prove the claim, we first observe that there exists a task $j^*$ with
$w_{j^*} \le s(k+2)$ that has positive probability on a machine in $L_{k+3}$. This is because
an optimal assignment strategy has to move some of the expected load from the
machines in $L_{k+3}$ to machines in $L \setminus L_{k+3}$ and it can only assign those tasks
to machines in $L \setminus L_{k+3}$ whose weights are not larger than the maximum speed
among this set of machines, which is $s(k+2)$. Now suppose $s(k) > \frac{1}{2} s(k+2)$.
The expected load of the fastest machine in $G_k = L_k \setminus L_{k+1}$ is at most $k+1$.
Thus the expected cost of $j^*$ on the fastest machine in $G_k$ is at most

\[ k + 1 + \frac{w_{j^*}}{s(k)} < k + 1 + \frac{2 w_{j^*}}{s(k+2)} \le k + 3. \]

This contradicts that the expected cost of $j^*$ in the considered Nash equilibrium
is at least $k+3$ as it has positive probability on a machine in $L_{k+3}$. Thus,
Lemma 20.16 is shown.


Now we apply Lemma 20.16 to prove an upper bound on the weights of the
tasks in the set $T_j^{(2)}$.


Lemma 20.17 For every $j \in [m]$ and $i \in T_j^{(2)}$, $w_i \le 12\,s_j$.


PROOF Let $i$ be a task from $T_j^{(2)}$, i.e., $p_i^j \in (0, \frac{1}{4})$. Let $j \in G_k$, for $0 \le k \le c-1$.
The expected cost of $i$ on $j$ is

\[ c_i^j = E[\ell_j] + (1 - p_i^j) \frac{w_i}{s_j} \ge k + \frac{3 w_i}{4 s_j}. \]

Suppose that $k \ge c-3$. In this case, $w_i > 12\,s_j$ implies $c_i^j > k + \frac{3}{4} \cdot 12 \ge c + 6$,
which contradicts that, under the Nash equilibrium profile, the expected cost of
any task on the fastest machine is at most $c + 1$. Hence, the lemma is shown for
$k \ge c-3$. Now suppose $k \le c-4$. Let $q$ denote the fastest machine from $G_{k+2}$.
Lemma 20.16 yields $s_q = s(k+2) \ge 2\,s(k) \ge 2\,s_j$. Hence, the expected cost of $i$
on $q$ is

\[ c_i^q = E[\ell_q] + (1 - p_i^q) \frac{w_i}{s_q} \le k + 3 + \frac{w_i}{2 s_j}. \]

As $p_i^j > 0$, the Nash equilibrium condition yields $c_i^j \le c_i^q$. Consequently,

\[ k + \frac{3 w_i}{4 s_j} \le k + 3 + \frac{w_i}{2 s_j}, \]

which implies $w_i \le 12\,s_j$ and, hence, completes the proof of Lemma 20.17.


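Let $z = \max_{i \in T_j^{(2)}} (w_i/s_j)$. Lemma 20.17 implies $z \le 12$. Now applying the
weighted Chernoff bound from Lemma 20.14 yields that, for every $\alpha > 0$,

\[ P[\ell_j^{(2)} \ge \alpha C] \le \left(\frac{e \cdot E[\ell_j^{(2)}]}{\alpha C}\right)^{\alpha C/z} \le \left(\frac{e}{\alpha}\right)^{\alpha C/12} \]

since $E[\ell_j^{(2)}] \le C$. We define $\tau = 24\,C \ln\ln m/\ln\ln\ln m$. As $C$ is of order
$\ln m/\ln\ln m$, it follows that $\tau$ is of order $\ln m/\ln\ln\ln m$. Let $x \ge 0$. We substitute
$\tau + x$ for $\alpha C$ and obtain

\[ P[\ell_j^{(2)} \ge \tau + x] \le \left(\frac{eC}{\tau + x}\right)^{(\tau + x)/12} \le \left(\frac{e \ln\ln\ln m}{24 \ln\ln m}\right)^{2C\ln\ln m/\ln\ln\ln m + x/12}. \]

Observe that $24 \ln\ln m/(e \ln\ln\ln m)$ is lower-bounded by $\sqrt{\ln\ln m}$ and also
lower-bounded by $e^2$. Furthermore, $C \ge \ln m/\ln\ln m$. Applying these bounds
yields

\[ P[\ell_j^{(2)} \ge \tau + x] \le \left(\frac{1}{\sqrt{\ln\ln m}}\right)^{2\ln m/\ln\ln\ln m} \cdot e^{-x/6} = m^{-1} \cdot e^{-x/6}. \]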


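As a consequence,

\[ E\left[\max_{j \in [m]} \ell_j^{(2)}\right] = \int_0^\infty P\left[\max_{j \in [m]} \ell_j^{(2)} \ge t\right] dt \le \tau + \int_0^\infty P\left[\max_{j \in [m]} \ell_j^{(2)} \ge \tau + x\right] dx \le \tau + \int_0^\infty \sum_{j \in [m]} P[\ell_j^{(2)} \ge \tau + x]\,dx. \]

Now applying our tail bound yields

\[ E\left[\max_{j \in [m]} \ell_j^{(2)}\right] \le \tau + \int_0^\infty e^{-x/6}\,dx = \tau + 6. \tag{20.3} \]

Finally, we combine Equations 20.2 and 20.3 and obtain

\[ \mathrm{cost}(P) = E\left[\max_{j \in [m]} \ell_j\right] \le 4C + \tau + 6 = O\left(\frac{\log m}{\log\log\log m}\right), \]

which completes the proof of Theorem 20.15.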


Next we show that the upper bound given in Theorem 20.15 is tight by showing 
that for every number of machines there exists a game instance that matches the upper 
bound up to a constant factor. 


Theorem 20.18 For every $m \in \mathbb{N}$, there exists an instance $G$ of the load balancing
game with $m$ machines and $n \le m$ tasks that has a Nash equilibrium
strategy profile $P$ with

\[ \mathrm{cost}(P) = \Omega\left( \frac{\log m}{\log\log\log m} \right) \cdot \mathrm{opt}(G). \]


PROOF The starting point for our construction is the game and the Nash assignment
$A$ from the proof of Theorem 20.9. We use mixed strategies in only
one of the groups, namely in the group $G_k$ with $k = \lceil q/2 \rceil$. Let $M$ denote the
number of machines in this group, i.e., $M = q!/k! \ge (q/2)^{\lfloor q/2 \rfloor}$. Observe that
$\log M = \Theta(q \log q) = \Theta(\log m)$.

Let $T$ denote the set of tasks mapped by $A$ to one of the machines in $G_k$.
The tasks in $T$ have weight $2^k$. Each of these tasks is now assigned uniformly at
random to a machine in group $G_k$, i.e., $p_i^j = \frac{1}{M}$ for each $j \in G_k$ and each $i \in T$.
For all other tasks the strategy profile $P$ corresponds without any change to the
pure strategy profile of assignment $A$. Observe that the randomization increases
the expected cost of the tasks. The expected cost of a task $i \in T$ on a machine
$j \in G_k$ is now

\[ c_i^j = E[\ell_j] + (1 - p_i^j) \frac{w_i}{s_j} = k + \left(1 - \frac{1}{M}\right) \le k + 1. \]

In the proof of Theorem 20.9, we have shown that the cost of a task $i$ of weight $2^k$
on a machine of group $G_j$ with $j \ne k$ is at least $k + 1$. Thus, the strategy profile
$P$ is a Nash equilibrium.

It remains to compare the social cost of the equilibrium profile $P$ with the
optimal cost. The structure of the optimal assignment is not affected by the
modifications. It has social cost $\mathrm{opt}(G) = 2$. Now we give a lower bound for
the social cost of $P$. This social cost is, obviously, bounded from below by the
maximum number of tasks that are mapped to the same machine in the group
$G_k$. Applying Proposition 20.11 with $M$ bins and $N = kM$ balls shows that the
expected makespan is

\[ \Omega\left( \frac{\ln M}{\ln\left(1 + \frac{1}{k} \ln M\right)} \right) = \Omega\left( \frac{\log m}{\log\log\log m} \right), \]

where the last estimate holds as $k = \Theta(\log m/\log\log m)$ and $\log M = \Theta(\log m)$.
This completes the proof of Theorem 20.18.


20.6 Summary and Discussion 


In this chapter, we studied the price of anarchy in load balancing games in four different 
variants. Table 20.1 summarizes the results about the price of anarchy that we have 
presented. In the case of pure equilibria on identical machines, the price of anarchy is 
bounded from above by a small constant term. In all other cases, the price of anarchy 
is bounded from above by a slowly growing, sublogarithmic function in the number of 
machines. One might interpret these results as a first game theoretic explanation why 
the resources in a large distributed system like the Internet that widely lacks global 
control are shared in a more or less efficient and fair way among different users with 
different interests, although the considered model is clearly oversimplifying in several 
aspects. 

Table 20.1. The price of anarchy for pure and mixed equilibria in load balancing games on
identical and uniformly related machines

            Identical                          Uniformly related
Pure        $2 - \frac{2}{m+1}$                $\Theta(\log m / \log\log m)$
Mixed       $\Theta(\log m / \log\log m)$      $\Theta(\log m / \log\log\log m)$

It is an interesting coincidence that both the price of anarchy for pure equilibria
on uniformly related machines as well as the price of anarchy for mixed equilibria
on identical machines are of order $\log m / \log\log m$. Although both of these models
result in essentially the same price of anarchy, the reasons for the increase in the social 
cost are quite different: In the case of pure equilibria on uniformly related machines, 
equilibrium assignments correspond to local optima with respect to moves of single 
tasks. That is, tasks are placed in a suboptimal but nevertheless coordinated fashion. 
On the contrary, in case of mixed equilibria, the increase in cost is due to collisions 
between uncoordinated random decisions. If one combines these two effects, then one 
loses only another very small factor of order log log m/ log log log m, which results 
in a price of anarchy of order log m/ log log log m for mixed equilibria on uniformly 
related machines. 

Obviously, the price of anarchy for load balancing games as we have defined them 
in the beginning of this chapter is well understood. As mentioned above, however, this 
model is very simplistic. To make these results more realistic, one needs to incorporate 
other aspects from practical application areas like, e.g., more realistic cost functions or 
other ways to define the social cost. We give pointers to studies of quite a few variants 
of load balancing games in the bibliographic notes. Christodoulou et al. (2004) make
an interesting attempt to add an algorithmic or constructive element to the
analysis of the price of anarchy. The idea behind so-called “coordination mechanisms” 
is not to study the price of anarchy for a fixed system, but to design the system in such 
a way that the increase in cost or the loss in performance due to selfish behavior is as 
small as possible. Similar aspects are also discussed in Chapter 17. We believe that 
this is a promising direction of research that might result in practical guidelines of how 
to build a distributed system that does not suffer from selfish behavior but might even 
exploit the selfishness of the agents. 

Besides the price of anarchy, we have studied the question of how agents reach a 
Nash equilibrium. We have observed that any sequence of improvement steps reaches 
a pure Nash equilibrium after a finite number of steps. In case of identical machines 
the max-weight best-response policy reaches an equilibrium in only O(n) steps. In case of
uniformly related machines, it is open whether there exists a short sequence of im- 
provement steps that lead from any given assignment to a pure Nash equilibrium. We 
think that this question is of great importance as Nash equilibria are only of interest 
if agents can reach them quickly. It is not clear that the only reasonable approach for 
the agents to reach a Nash equilibrium in a distributed way is to use improvement 
steps. There might also be other, possibly more strategic or more coordinated behav- 
ioral rules that quickly converge to a Nash equilibrium or to an approximate Nash 
equilibrium. For example, Chapter 29 considers some approaches from evolutionary 
game theory in the context of routing in networks. It is an interesting research prob- 
lem to design distributed protocols that ensure that agents reach a Nash equilibrium 
quickly. Pointers to first results toward this direction can be found in the bibliographic 
notes. 
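The max-weight best-response policy on identical machines mentioned above can be sketched in a few lines. This is an illustrative implementation under simple assumptions, not the chapter's formal definition: weights are positive integers, ties are broken by index, and the function names are hypothetical. In every step, the heaviest unsatisfied task moves to a currently least-loaded machine.

```python
from typing import List, Sequence, Tuple


def loads(assignment: Sequence[int], weights: Sequence[int], m: int) -> List[int]:
    """Load of each of the m identical machines under the given assignment."""
    result = [0] * m
    for task, machine in enumerate(assignment):
        result[machine] += weights[task]
    return result


def is_pure_nash(assignment: Sequence[int], weights: Sequence[int], m: int) -> bool:
    """True if no task can strictly lower its cost by moving to another machine."""
    load = loads(assignment, weights, m)
    return all(min(load) + weights[i] >= load[j] for i, j in enumerate(assignment))


def max_weight_best_response(
    assignment: Sequence[int], weights: Sequence[int], m: int
) -> Tuple[List[int], int]:
    """Repeatedly move the heaviest unsatisfied task; return final assignment and step count."""
    current = list(assignment)
    steps = 0
    while True:
        load = loads(current, weights, m)
        # A task is unsatisfied if a least-loaded machine would give it a strictly lower cost.
        unsatisfied = [i for i, j in enumerate(current) if min(load) + weights[i] < load[j]]
        if not unsatisfied:
            return current, steps
        task = max(unsatisfied, key=lambda i: weights[i])
        current[task] = load.index(min(load))  # best response on identical machines
        steps += 1


if __name__ == "__main__":
    final, steps = max_weight_best_response([0, 0, 0, 0], [3, 2, 2, 1], m=2)
    print(final, steps, is_pure_nash(final, [3, 2, 2, 1], 2))
```

On the small instance in the usage example, each task is moved at most once, in line with the linear bound stated for identical machines.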


20.7 Bibliographic Notes 


The concept of the price of anarchy was introduced by Koutsoupias and Papadimitriou 
(1999). In their seminal work, they study load balancing in form of a routing game 
consisting of two nodes connected by parallel edges with possibly different speeds.
Each agent has an amount of traffic that the agent seeks to map to one of the edges such 
that the load on this edge is as small as possible. In our notation, the parallel edges 
between the source and the sink correspond to the machines and the pieces of traffic 
of the agents correspond to the tasks. Let us remark that originally the ratio between 
the social cost in a worst-case Nash equilibrium and the optimal social cost was called 
coordination ratio but in this chapter we switched to the now commonly used term 
price of anarchy. The game theoretic model underlying the load balancing games is 
also known as KP model. 

The results presented in Table 20.1 have been obtained in the following studies. 
The upper bound of $2 - \frac{2}{m+1}$ on the price of anarchy for pure equilibria in load
balancing games with identical machines goes back to the scheduling literature (Finn 
and Horowitz, 1979), where the same ratio occurs in form of an approximation factor 
for a local search optimization heuristic. The lower bound on the price of anarchy for 
mixed equilibria on identical machines is presented in Koutsoupias and Papadimitriou 
(1999). The analysis for the corresponding upper bound is obtained in Czumaj and
Vöcking (2002) and Koutsoupias et al. (2003). Let us remark that the analysis in
Czumaj and Vöcking (2002) is tight up to a constant additive term. It shows that the
price of anarchy for mixed equilibria in load balancing games on identical machines
is $\Gamma^{-1}(m) + \Theta(1)$. The upper and lower bounds on the price of anarchy for pure and
mixed equilibria in load balancing games with uniformly related machines are from
Czumaj and Vöcking (2002) as well. This work also contains a tight characterization
of the price of anarchy as a function of the ratio between the speeds of the fastest and 
the slowest machine. 

The existence proof for pure equilibria presented in Section 20.1.1 can be found in 
Fotakis et al. (2002) and Even-Dar et al. (2003). The result from Section 20.3.2 that 
the LPT algorithm computes a pure Nash equilibrium is presented in Fotakis et al. 
(2002) together with several further results about the complexity of computing pure 
and mixed equilibria in load balancing games. The uniqueness of the fully mixed Nash 
equilibrium is shown in Mavronicolas and Spirakis (2001). Exercise 20.5 reworks the 
nice proof for this result. The counterexample to the fully mixed Nash equilibrium 
conjecture presented in Section 20.4.1 is from Fischer and Vöcking (2005). Finally,
the results from Section 20.2.2 about the convergence of best response sequences are 
from Even-Dar et al. (2003). 

Let us remark that this chapter is far from giving a complete overview of the rich
literature about different variants of games for load balancing or routing on parallel
links. We conclude this chapter with a few pointers to further literature. Load balancing
games with more general cost functions are considered, e.g., in Caragiannis et al. (2006),
Czumaj et al. (2002), Libman and Orda (1999, 2001). Other definitions of the social
cost are considered, e.g., in Caragiannis et al. (2006), Gairing et al. (2004a, 2004b) and 
Suri et al. (2004). Another interesting variant of load balancing games assumes that 
agents come with subsets of the machines on which they have to place their tasks. The 
price of anarchy in such a restricted assignment model is investigated in Awerbuch et al. 
(2003), Gairing et al. (2006), and Suri et al. (2004). The price of anarchy with respect 
to equilibria that are robust against coalitions is studied in Andelman et al. (2007). 
An important aspect that we have only touched in this chapter is the complexity of 
computing Nash equilibria for load balancing games. Further work dealing with the
computation of Nash equilibria can be found, e.g., in Even-Dar et al. (2003), Feldmann
et al. (2003a), Fotakis et al. (2002), Fischer and Vöcking (2005), and Gairing et al.
(2004a). Recent work deals also with the convergence time of distributed load balancing 
processes in which agents make parallel attempts for improvement steps until they find 
a Nash equilibrium (Berenbrink et al., 2006; Even-Dar and Mansour, 2005). Another 
interesting topic is load balancing games with incomplete information that have been 
considered, e.g., in Beier et al. (2004) and Gairing et al. (2005). Finally, let us remark 
that the concept of coordination mechanisms has been suggested in Christodoulou 
et al. (2004) and some further results on this topic can be found in Immorlica et al. 
(2005). 

Several other results for load balancing and routing on parallel links have been 
collected in the surveys (Czumaj, 2004; Feldmann et al., 2003b; Koutsoupias, 2003). 


Bibliography 


N. Andelman, M. Feldman, and Y. Mansour. Strong price of anarchy. In Proc. 18th Annual ACM-SIAM
Symp. on Discrete Algorithms, 2007.

B. Awerbuch, Y. Azar, Y. Richter, and D. Tsur. Tradeoffs in worst-case equilibria. In Proc. 1st
International Workshop on Approximation and Online Algorithms (WAOA), pp. 41-52, 2003.

R. Beier, A. Czumaj, P. Krysta, and B. Vöcking. Computing equilibria for congestion games with
(im)perfect information. In Proc. 15th Annual ACM-SIAM Symp. Discrete Algorithms, pp. 746-755,
2004.

P. Berenbrink, T. Friedetzky, L.A. Goldberg, P.W. Goldberg, Z. Hu, and R.A. Martin. Distributed
selfish load balancing. In Proc. 17th Annual ACM-SIAM Symp. Discrete Algorithms, pp. 354-363,
2006.

I. Caragiannis, M. Flammini, C. Kaklamanis, P. Kanellopoulos, and L. Moscardelli. Tight bounds
for selfish and greedy load balancing. In Proc. 33rd Intl. Colloq. on Automata, Languages, and
Programming, pp. 311-322, 2006.

G. Christodoulou, E. Koutsoupias, and A. Nanavati. Coordination Mechanisms. In Proc. 31st Intl.
Colloq. on Automata, Languages and Programming, pp. 345-357, 2004.

A. Czumaj. Selfish Routing on the Internet. Chapter 42 in Handbook of Scheduling: Algorithms,
Models, and Performance Analysis, edited by J. Leung, CRC Press, Boca Raton, FL, 2004.

A. Czumaj, P. Krysta, and B. Vöcking. Selfish traffic allocation for server farms. In Proc. 34th Annual
ACM Symp. Theory of Computing, pp. 287-296, 2002.

A. Czumaj and B. Vöcking. Tight bounds for worst-case equilibria. In Proc. 13th Annual ACM-SIAM
Symp. on Discrete Algorithms, pp. 413-420, 2002.

E. Even-Dar, A. Kesselman, and Y. Mansour. Convergence time to Nash equilibria. In Proc. 30th
International Colloq. on Automata, Languages and Programming, pp. 502-513, 2003.

E. Even-Dar and Y. Mansour. Fast convergence of selfish rerouting. In Proc. 16th Annual ACM-SIAM
Symp. on Discrete Algorithms, pp. 772-781, 2005.

R. Feldmann, M. Gairing, T. Lücking, B. Monien, and M. Rode. Nashification and the coordination
ratio for a selfish routing game. In Proc. 30th International Colloq. on Automata, Languages and
Programming, pp. 414-426, 2003a.

R. Feldmann, M. Gairing, T. Lücking, B. Monien, and M. Rode. Selfish routing in non-cooperative
networks: a survey. In Proc. 28th International Symp. on Mathematical Foundations of Computer
Science, pp. 21-45, 2003b.

G. Finn and E. Horowitz. A linear time approximation algorithm for multiprocessor scheduling. BIT,
19(3):312-320, 1979.

S. Fischer and B. Vöcking. On the structure and complexity of worst-case equilibria. In Proc. 1st
Workshop on Internet and Network Economics, pp. 151-160, 2005.

D. Fotakis, S. Kontogiannis, E. Koutsoupias, M. Mavronicolas, and P. Spirakis. The structure and
complexity of Nash equilibria for a selfish routing game. In Proc. 29th Intl. Colloquium on
Automata, Languages and Programming (ICALP), pp. 123-134, 2002.

D.K. Friesen. Tighter bounds for LPT scheduling on uniform processors. SIAM J. Computing,
16(3):554-560, 1987.

M. Gairing, T. Lücking, M. Mavronicolas, and B. Monien. Computing Nash equilibria for scheduling
on restricted parallel links. In Proc. 36th Annual ACM Symp. on Theory of Computing, pp. 613-622,
2004a.

M. Gairing, T. Lücking, M. Mavronicolas, and B. Monien. The price of anarchy for polynomial social
cost. In Proc. 29th Intl. Symp. on Mathematical Foundations of Computer Science, pp. 574-585,
2004b.

M. Gairing, T. Lücking, M. Mavronicolas, B. Monien, and M. Rode. Nash equilibria in discrete
routing games with convex latency functions. In Proc. 31st Intl. Colloq. on Automata, Languages
and Programming, pp. 645-657, 2004c.

M. Gairing, T. Lücking, M. Mavronicolas, and B. Monien. The Price of Anarchy for Restricted
Parallel Links. Parallel Process. Lett., 16(1):117-132, 2006.

M. Gairing, B. Monien, and K. Tiemann. Selfish routing with incomplete information. In Proc. 17th
Annual ACM Symp. on Parallel Algorithms, pp. 203-212, 2005.

R.L. Graham. Bounds for certain multiprocessing anomalies. Bell System Tech. J., 45:1563-1581,
1966.

R.L. Graham. Bounds on multiprocessing timing anomalies. SIAM J. Appl. Math., 17:263-269, 1969.

D.S. Hochbaum and D.B. Shmoys. A polynomial approximation scheme for scheduling on uniform
processors. SIAM J. Computing, 17(3):539-551, 1988.

N. Immorlica, L. Li, V.S. Mirrokni, and A. Schulz. Coordination mechanisms for selfish scheduling.
In Proc. 1st Workshop on Internet and Network Economics, pp. 55-69, 2005.

E. Koutsoupias. Selfish task allocation. Bulletin of the EATCS (81), pp. 79-88, 2003.

E. Koutsoupias, M. Mavronicolas, and P. Spirakis. Approximate equilibria and ball fusion. Theory of
Computing Systems, 36(6):683-693, 2003.

E. Koutsoupias and C.H. Papadimitriou. Worst-case equilibria. In Proc. 16th Annual Symp. on
Theoretical Aspects of Computer Science, pp. 404-413, 1999.

L. Libman and A. Orda. The designer's perspective to atomic noncooperative networks. IEEE/ACM
Trans. Networking, 7(6):875-884, 1999.

L. Libman and A. Orda. Atomic resource sharing in noncooperative networks. Telecommun. Systems,
17(4):385-409, 2001.

M. Mavronicolas and P. Spirakis. The price of selfish routing. In Proc. 33rd ACM Symp. on Theory
of Computing, pp. 510-519, 2001.

M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic
Analysis. Cambridge University Press, 2005.

S. Suri, C. Toth, and Y. Zhou. Selfish load balancing and atomic congestion games. In Proc. 16th
Annual ACM Symp. on Parallel Algorithms and Architectures, pp. 188-195, 2004.


542 


20.2 


20.3 


20.4 


20.5 


20.6 


SELFISH LOAD BALANCING 


Exercises 


Let G be any instance of the load balancing game with three tasks that should be 
placed on two identical machines. Show that any pure Nash equilibrium for G is 
optimal, i.e., cost(A) = opt(G) for any equilibrium assignment A. 

Remark: Interestingly, the example presented in Section 20.1.2 that yields the 
worst-case price of anarchy for two identical machines uses only four tasks. 


Show, for every me€N, there exists an instance G of the load balancing game 
with m identical machines and 2m tasks that has a Nash equilibrium assignment 
A: [n] > [m] with 


2 
cost(A) = (2 — —) - opt(G). 


Hint: Generalize the example with two machines given in Section 20.1.2. 


20.3 Prove that the price of anarchy for pure equilibria on instances of the load balancing 
game with two tasks and two machines with possibly different speeds corresponds 
to the golden ratio $\phi = \frac{1}{2}(1 + \sqrt{5})$. That is, show that 

(a) there is a game instance G admitting an equilibrium assignment A with 
$\mathrm{cost}(A) = \phi \cdot \mathrm{opt}(G)$; 

(b) for every game instance G and every equilibrium assignment A for this instance, 
it holds that $\mathrm{cost}(A) \le \phi \cdot \mathrm{opt}(G)$. 


20.4 Consider an instance of the load balancing game with two tasks both of which 
have weight 1 and two machines, one of speed 1 and the other of speed s > 0. 

(a) Show that there does not exist a fully mixed Nash equilibrium if $s \le \frac{1}{2}$ or $s \ge 2$. 

(b) Show that there exists a unique fully mixed Nash equilibrium if $\frac{1}{2} < s < 2$. 
Describe the strategy profile of this equilibrium as a function of s. 


20.5 Show that there exists at most one fully mixed Nash equilibrium for every instance 
of the load balancing game. 

Hint: Describe the conditions on the probabilities $p_i^j$ imposed by a fully mixed 
Nash equilibrium in form of a system of linear equations and show that this system 
has a unique solution. If all the values for the variables $p_i^j$ in this solution are 
positive then the solution describes a fully mixed Nash equilibrium. Otherwise, 
there does not exist a fully mixed equilibrium. 


20.6 Suppose that we are given an instance G of the load balancing game with m 
identical machines and n tasks whose weights are bounded from above by 
$\alpha \cdot \mathrm{opt}(G)$, for $0 < \alpha < 1$. 

(a) Show that $\mathrm{cost}(A) \le (1 + \alpha) \cdot \mathrm{opt}(G)$, for every equilibrium assignment A. 

(b) Let $\alpha = \frac{1}{\log m}$. Show that $\mathrm{cost}(P) = O(\mathrm{opt}(G))$, for every equilibrium strategy 
profile P. 


CHAPTER 21 


The Price of Anarchy and the 
Design of Scalable Resource 
Allocation Mechanisms 


Ramesh Johari 


Abstract 


In this chapter, we study the allocation of a single infinitely divisible resource among multiple 
competing users. While we aim for efficient allocation of the resource, the task is complicated by the 
fact that users’ utility functions are typically unknown to the resource manager. We study the design 
of resource allocation mechanisms that are approximately efficient (i.e., have a low price of anarchy), 
with low communication requirements (i.e., the strategy spaces of users are low dimensional). 

Our main results concern the proportional allocation mechanism, for which a tight bound on 
the price of anarchy can be provided. We also show that in a wide range of market mechanisms 
that use a single market-clearing price, the proportional allocation mechanism minimizes the price 
of anarchy. Finally, we relax the assumption of a single market-clearing price, and show that by 
extending the class of Vickrey–Clarke–Groves mechanisms all Nash equilibria can be guaranteed to 
be fully efficient. 


21.1 Introduction 


This chapter deals with a canonical resource allocation problem. Suppose that a finite 
number of users compete to acquire a share of an infinitely divisible resource of fixed 
capacity. How should the resource be shared among the users? We will frame this 
problem as an economic problem: we assume that each user has a utility function that 
is increasing in the amount of the resource received, and then design a mechanism 
to maximize aggregate utility. In the absence of any strategic considerations, this is a 
simple optimization problem; however, if we assume that the agents are strategic, we 
need to design the resource allocation mechanisms to be robust to gaming behavior. 
A central theme of this chapter is that the price of anarchy can be used as a design 
metric; i.e., “robust” allocation mechanisms are those that have a low price of anarchy. 
The present chapter is thus a bridge between two different themes of the book. The 
first theme is that of optimal mechanism design (Part II): given selfish agents, how do 
we successfully design mechanisms that nevertheless yield efficient outcomes? The 
second theme is that of quantifying inefficiency (Part III): given a prediction of game 
theoretic behavior, how well does it perform relative to some efficient benchmark? In 
this chapter, we use the quantification of inefficiency as the “objective function” with 
which we will design optimal mechanisms. As we will see, for the resource allocation 
problems we consider, this approach yields surprising insights into the structure of 
optimal mechanisms. 

The mechanisms we consider for resource allocation are motivated by constraints 
present in modern communication networks, and similar systems where communication 
is limited; this precludes use of the traditional Vickrey–Clarke–Groves mechanisms 
(Chapter 9), which require declaration of the entire utility function. If we interpret the 
single resource above as a communication link, then we view the mechanism as an 
allocation policy operating on that link. We wish to design mechanisms that, intuitively, 
impose low communication overhead on the overall system; throughout this chapter, 
that scalability constraint translates into the assumption that the players can use only 
low-dimensional (in fact, one-dimensional) strategy spaces. 

The remainder of the chapter is organized as follows. In Section 21.2, we introduce 
the basic resource allocation model we will consider in this chapter, and then introduce 
a simple approach to allocating the fixed resource: the proportional allocation mecha- 
nism. In this mechanism, each user submits a bid, and receives a share of the resource 
in proportion to their bid. We analyze this model under both the assumption that users 
are price takers (i.e., that they do not anticipate the effect of their strategic decision 
on the price of the resource); and the assumption that users are price anticipators. 
The former case yields full efficiency, while in the latter we characterize the price of 
anarchy. In Section 21.3, we state and prove a theorem showing that in a nontrivial 
class of “scalable” market mechanisms (in the sense informally discussed above), the 
proportional allocation mechanism has the lowest price of anarchy (i.e., minimizes the 
efficiency loss) when users are price anticipating. 

In all the mechanisms considered in the first two sections, players have one- 
dimensional strategy spaces, and the mechanism also only chooses a single price. 
Because of these constraints, even the highest performance mechanisms suffer a posi- 
tive efficiency loss, as demonstrated in Section 21.3. In the final section of the chapter, 
we consider the implications of removing the “single price” constraint. We show in 
Section 21.4 that if we consider mechanisms with scalar strategy spaces, and allow the 
mechanism to choose one price per user of the resource, then in fact full efficiency is 
achievable at Nash equilibrium. The result involves extending the well-known class of 
Vickrey–Clarke–Groves (VCG) mechanisms to use only a scalar strategy space; for 
more on VCG mechanisms, see Chapter 9. 


21.2 The Proportional Allocation Mechanism 


Suppose that R users share a resource of capacity C > 0. Let $d_r$ denote the amount 
allocated to user r. We assume that user r receives a utility equal to $U_r(d_r)$ if the 
allocated amount is $d_r$; we assume that utility is measured in monetary units. We make 
the following assumption on the utility functions; we emphasize that this assumption 
will be in force for the duration of the chapter, unless otherwise mentioned. 

Assumption 1 For each r, over the domain $d_r \ge 0$ the utility function $U_r(d_r)$ is 
concave, strictly increasing, and continuous; and over the domain $d_r > 0$, $U_r(d_r)$ is 
continuously differentiable. Furthermore, the right directional derivative at 0, denoted 
$U_r'(0)$, is finite. We let $\mathcal{U}$ denote the set of all utility functions satisfying these conditions. 


We note that we make rather strong differentiability assumptions here on the utility 
functions; these assumptions are primarily made to ease the presentation. It is possible 
to relax the differentiability assumptions (see Notes for details). 

Given complete knowledge and centralized control of the system, a natural problem 
for the network manager to try to solve is the following optimization problem: 


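SYSTEM: 
$$
\begin{aligned}
\text{maximize} \quad & \sum_r U_r(d_r) && (21.1) \\
\text{subject to} \quad & \sum_r d_r \le C; && (21.2) \\
& d_r \ge 0, \quad r = 1, \ldots, R. && (21.3)
\end{aligned}
$$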


Note that the objective function of this problem is the utilitarian social welfare function 
(cf. Chapter 17); it becomes a reasonable objective if we assume that all utilities are 
measured in the same (monetary) units. Since the objective function is continuous 
and the feasible region is compact, an optimal solution $d = (d_1, \ldots, d_R)$ exists. If the 
functions $U_r$ are strictly concave, then the optimal solution is unique, since the feasible 
region is convex. 

In general, the utility functions are not available to the resource manager. As a result, 
we consider the following pricing scheme for resource allocation, which we refer to as 
the proportional allocation mechanism. Each user r gives a payment (also called a bid) 
of $w_r$ to the resource manager; we assume $w_r \ge 0$. Given the vector $w = (w_1, \ldots, w_R)$, 
the resource manager chooses an allocation $d = (d_1, \ldots, d_R)$. We assume the manager 
treats all users alike; in other words, the network manager does not price discriminate. 
Each user is charged the same price $\mu > 0$, leading to $d_r = w_r/\mu$. We further assume 
that the manager always seeks to allocate the entire resource capacity C; in this case, 
we expect the price $\mu$ to satisfy 


$$ \sum_r \frac{w_r}{\mu} = C. $$

The preceding equality can only be satisfied if $\sum_r w_r > 0$, in which case we have 

$$ \mu = \frac{\sum_r w_r}{C}. \qquad (21.4) $$

In other words, if the manager chooses to allocate the entire resource, and does not 
price discriminate between users, then for every nonzero w there is a unique price 
$\mu > 0$, which must be chosen by the network, given by the previous equation. 

We can interpret this mechanism as a market-clearing process by which a price is set 
so that demand equals supply. To see this interpretation, note that when a user chooses 
a total payment $w_r$, it is as if the user has chosen a demand function $D(p, w_r) = w_r/p$ 
for $p > 0$. The demand function describes the quantity the user demands at any given 
price $p > 0$. The resource manager then chooses a price $\mu$ so that $\sum_r D(\mu, w_r) = C$, 
i.e., so that the aggregate demand equals the supply C. For the specific form of demand 
functions we consider here, this leads to the expression for $\mu$ given in (21.4). User r 
then receives an allocation given by $D(\mu, w_r)$, and makes a payment $\mu D(\mu, w_r) = w_r$. 
This interpretation will be further explored in Section 21.3, where we consider other 
market-clearing mechanisms for allocating a single resource in inelastic supply, with 
the users choosing demand functions from a family parameterized by a single scalar. 
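
As a concrete illustration of the market-clearing rule (21.4), the following minimal Python sketch computes the price and the allocation from a bid vector. The function name `proportional_allocation` and its argument names are ours, introduced only for illustration; the sketch assumes nonnegative bids, at least one of which is positive.

```python
def proportional_allocation(bids, capacity):
    """Return (price, allocation) for the proportional allocation mechanism.

    bids     -- list of nonnegative payments w_r (illustrative name)
    capacity -- resource capacity C > 0
    """
    total = sum(bids)
    if total <= 0:
        # The market-clearing price (21.4) is only defined for a nonzero bid vector.
        raise ValueError("at least one bid must be positive")
    price = total / capacity                  # mu = sum_r w_r / C, equation (21.4)
    allocation = [w / price for w in bids]    # d_r = w_r / mu
    return price, allocation


if __name__ == "__main__":
    mu, d = proportional_allocation([1.0, 3.0], capacity=2.0)
    print(mu, d)
```

Each user's share equals its fraction of the total payment times C, so the allocations always sum to the capacity.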


21.2.1 Price Taking Users and Competitive Equilibrium 


In this section, we consider a competitive equilibrium between the users and the resource 
manager. A central assumption in the definition of competitive equilibrium is that each 
user does not anticipate the effect of their payment $w_r$ on the price $\mu$; i.e., each user 
acts as a price taker. In this case, given a price $\mu > 0$, user r acts to maximize the 
following payoff function over $w_r \ge 0$: 

$$ P_r(w_r; \mu) = U_r\!\left(\frac{w_r}{\mu}\right) - w_r. \qquad (21.5) $$

The first term represents the utility to user r of receiving a resource allocation equal 
to $w_r/\mu$; the second term is the payment $w_r$ made to the manager. Observe that this 
definition is consistent with the notion that all utilities are measured in monetary units. 

We now say a pair $(w, \mu)$ with $w \ge 0$ and $\mu > 0$ is a competitive equilibrium if 
users maximize their payoff as defined in (21.5), and the network “clears the market” 
by setting the price $\mu$ according to (21.4): 

$$
\begin{aligned}
P_r(w_r; \mu) &\ge P_r(\bar{w}_r; \mu) \quad \text{for } \bar{w}_r \ge 0, \quad r = 1, \ldots, R; && (21.6) \\
\mu &= \frac{\sum_r w_r}{C}. && (21.7)
\end{aligned}
$$


The following theorem shows that under our assumptions, a competitive equilibrium 
always exists, and any competitive equilibrium maximizes aggregate utility. 


Theorem 21.1 There exists a competitive equilibrium $(w, \mu)$. In this case, the 
vector $d = w/\mu$ is an optimal solution to SYSTEM. 


PROOF The key idea in the proof is to use Lagrangian techniques to establish that 
optimality conditions for (21.6)–(21.7) are identical to the optimality conditions 
for the problem SYSTEM, under the identification $d = w/\mu$. 

Observe that under Assumption 1, the payoff (21.5) is concave in $w_r$ for any 
$\mu > 0$. Thus considering the first-order condition for maximization of $P_r(w_r; \mu)$ 
over $w_r \ge 0$, we conclude w and $\mu$ are a competitive equilibrium if and only if 

$$
\begin{aligned}
U_r'(d_r) &= \mu, \quad \text{if } d_r > 0; && (21.8) \\
U_r'(0) &\le \mu, \quad \text{if } d_r = 0; && (21.9) \\
\sum_r d_r &= C, && (21.10)
\end{aligned}
$$

where $d_r = w_r/\mu$. A straightforward Lagrangian optimization shows that the 
preceding conditions are exactly the optimality conditions for the problem SYSTEM, 
so we conclude w and $\mu$ are a competitive equilibrium if and only if $d = w/\mu$ is 
a solution to SYSTEM with Lagrange multiplier $\mu$. Since at least one solution to 
SYSTEM must exist, the proof is complete. 
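
The conditions (21.8)–(21.10) also suggest a simple way to compute a competitive equilibrium numerically. The sketch below is only an illustration under an assumption of ours: the utilities are taken to be $U_r(d_r) = a_r \log(1 + d_r)$ with $a_r > 0$, which satisfy Assumption 1. For this family a price taker's optimal bid is $w_r = \max(0, a_r - \mu)$, and we search for the price $\mu$ at which these bids satisfy (21.7) by bisection. The function name and parameters are hypothetical.

```python
def competitive_equilibrium_log(a, capacity, iters=200):
    """Competitive equilibrium for the assumed utilities U_r(d) = a[r] * log(1 + d).

    Returns (mu, bids, allocation). Illustrative sketch, not a general solver.
    """
    def best_bids(mu):
        # A price taker maximizes a_r * log(1 + w/mu) - w, giving w = max(0, a_r - mu).
        return [max(0.0, ar - mu) for ar in a]

    lo, hi = 0.0, max(a)   # at mu = max(a) all bids are zero; near mu = 0 they are too large
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        if sum(best_bids(mu)) > mu * capacity:
            lo = mu        # bids imply a clearing price above mu
        else:
            hi = mu
    mu = 0.5 * (lo + hi)
    bids = best_bids(mu)
    allocation = [w / mu for w in bids]   # d_r = w_r / mu
    return mu, bids, allocation


if __name__ == "__main__":
    print(competitive_equilibrium_log([3.0, 1.0], capacity=1.0))
```

In the second test case below, the user with the smaller coefficient bids zero because $U_2'(0) \le \mu$, matching condition (21.9).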


Theorem 21.1 shows that under the assumption that the users of the resource behave 
as price takers, there exists a bid vector w where all users have optimally chosen their 
bids $w_r$, with respect to the given price $\mu = \sum_r w_r/C$; and at this “equilibrium,” 
aggregate utility is maximized. However, when the price taking assumption is violated, 
the model changes into a game and the guarantee of Theorem 21.1 is no longer valid. 
We investigate this game in the following section. 


21.2.2 Price Anticipating Users and Nash Equilibrium 


We now consider an alternative model where the users of a single resource are price 
anticipating, rather than price takers. The key difference is that while the payoff function 
$P_r$ takes the price $\mu$ as a fixed parameter in (21.5), price anticipating users will realize 
that $\mu$ is set according to (21.4), and adjust their payoff accordingly; this makes the 
model a game between the R players. 

We use the notation $w_{-r}$ to denote the vector of all bids by users other than r; 
i.e., $w_{-r} = (w_1, w_2, \ldots, w_{r-1}, w_{r+1}, \ldots, w_R)$. Given $w_{-r}$, each user r chooses $w_r$ to 
maximize: 

$$
Q_r(w_r; w_{-r}) =
\begin{cases}
U_r\!\left(\dfrac{w_r}{\sum_s w_s}\, C\right) - w_r, & \text{if } w_r > 0; \\[1ex]
U_r(0), & \text{if } w_r = 0
\end{cases}
\qquad (21.11)
$$

over nonnegative $w_r$. The second condition is required so that the resource allocation to 
user r is zero when $w_r = 0$, even if all other users choose $w_{-r}$ so that $\sum_{s \ne r} w_s = 0$. The 
payoff function $Q_r$ is similar to the payoff function $P_r$, except that the user anticipates 
that the network will set the price $\mu$ according to (21.4). A Nash equilibrium of the 
game defined by $(Q_1, \ldots, Q_R)$ is a vector $w \ge 0$ such that for all r: 

$$ Q_r(w_r; w_{-r}) \ge Q_r(\bar{w}_r; w_{-r}), \quad \text{for all } \bar{w}_r \ge 0. \qquad (21.12) $$


Note that the payoff function in (21.11) may be discontinuous at $w_r = 0$, if 
$\sum_{s \ne r} w_s = 0$. This discontinuity may preclude existence of a Nash equilibrium; it 
is easy to see this in the case where the system consists of only a single user with a 
strictly increasing utility function. Nevertheless, as long as at least two users are com- 
peting, it is possible to show that a unique Nash equilibrium exists, by noting that such 
an equilibrium solves a version of the SYSTEM problem but with “modified” utility 
functions. 


Theorem 21.2 Suppose that R > 1. Then there exists a unique Nash equilibrium 
$w \ge 0$ of the game defined by $(Q_1, \ldots, Q_R)$, and it satisfies $\sum_r w_r > 0$. In 
this case, the vector d defined by 

$$ d_r = \frac{w_r}{\sum_s w_s}\, C, \quad r = 1, \ldots, R, \qquad (21.13) $$

is the unique optimal solution to the following optimization problem: 

GAME: 
$$
\begin{aligned}
\text{maximize} \quad & \sum_r \hat{U}_r(d_r) && (21.14) \\
\text{subject to} \quad & \sum_r d_r \le C; && (21.15) \\
& d_r \ge 0, \quad r = 1, \ldots, R, && (21.16)
\end{aligned}
$$
where 
$$ \hat{U}_r(d_r) = \left(1 - \frac{d_r}{C}\right) U_r(d_r) + \left(\frac{d_r}{C}\right) \left(\frac{1}{d_r} \int_0^{d_r} U_r(z)\, dz\right). \qquad (21.17) $$


PROOF The proof is similar to the proof of Theorem 21.1. The first key step 
is to note that at any Nash equilibrium, at least two components of w must be 
positive; this follows from the payoff (21.11) (see Exercise 17.5). Given this fact, 
the payoff of each user r is strictly concave and continuous in $w_r$, so that w is a 
Nash equilibrium if and only if the following first-order conditions hold: 

$$
\begin{aligned}
U_r'\!\left(\frac{w_r}{\sum_s w_s}\, C\right) \left(1 - \frac{w_r}{\sum_s w_s}\right) &= \frac{\sum_s w_s}{C}, \quad \text{if } w_r > 0; && (21.18) \\
U_r'(0) &\le \frac{\sum_s w_s}{C}, \quad \text{if } w_r = 0. && (21.19)
\end{aligned}
$$

Note that if we define $p = \sum_s w_s/C$ and $d_r = w_r/p$, then the preceding 
conditions can be rewritten as 

$$
\begin{aligned}
\hat{U}_r'(d_r) &= p, \quad \text{if } d_r > 0; && (21.20) \\
\hat{U}_r'(0) &\le p, \quad \text{if } d_r = 0; && (21.21) \\
\sum_r d_r &= C. && (21.22)
\end{aligned}
$$

Here we have used the fact that, by (21.17), $\hat{U}_r'(d_r) = U_r'(d_r)(1 - d_r/C)$. 
Note that these are identical to (21.8)–(21.10), but for the modified objective 
function (21.14). Since the utility functions $\hat{U}_r(d_r)$ are strictly concave and continuous 
over $0 \le d_r \le C$, the preceding first-order conditions are sufficient optimality 
conditions for GAME. We conclude that w is a Nash equilibrium if and only if 
$\sum_s w_s > 0$, and the resulting allocation d solves the problem GAME with 
Lagrange multiplier $p = \sum_s w_s/C$. To conclude the proof, observe that GAME has 
a strictly concave and continuous objective function over a compact feasible 
region, and thus has a unique optimal solution. It is straightforward to verify that 
this implies uniqueness of the Nash equilibrium as well. 
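
To make the characterization (21.20)–(21.22) concrete, the following sketch computes the Nash equilibrium for the special case of linear utilities $U_r(d_r) = a_r d_r$ with $a_r > 0$; this restriction, and the function name `nash_equilibrium_linear`, are assumptions made only for illustration. In this case $\hat{U}_r'(d_r) = a_r(1 - d_r/C)$, so for a candidate price p each user's allocation is $C \max(0, 1 - p/a_r)$, and we find by bisection the price at which the allocations sum to C.

```python
def nash_equilibrium_linear(slopes, capacity, iters=200):
    """Nash equilibrium of the proportional allocation game for the assumed
    linear utilities U_r(d) = slopes[r] * d. Returns (price, bids, allocation)."""
    if len(slopes) < 2:
        raise ValueError("Theorem 21.2 requires at least two users")

    def allocation(p):
        # From (21.20)-(21.21): a_r * (1 - d_r / C) = p if d_r > 0, else d_r = 0.
        return [capacity * max(0.0, 1.0 - p / a) for a in slopes]

    lo, hi = 0.0, max(slopes)   # total allocation is R*C at p = 0 and 0 at p = max slope
    for _ in range(iters):
        p = 0.5 * (lo + hi)
        if sum(allocation(p)) > capacity:
            lo = p              # price too low: demand exceeds capacity
        else:
            hi = p
    p = 0.5 * (lo + hi)
    alloc = allocation(p)
    bids = [p * d for d in alloc]   # w_r = p * d_r
    return p, bids, alloc


if __name__ == "__main__":
    print(nash_equilibrium_linear([1.0, 0.5, 0.5], capacity=1.0))
```

For two identical users with slope 1 and C = 1, the sketch returns bids of 1/4 each, which can be checked directly against (21.18).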


Note that the preceding theorem gives a form of “potential” for the game under 
consideration: the Nash equilibrium is characterized as the unique solution to a natural 
optimization problem. However, the objective function for this optimization problem 
is not a true (exact or ordinal) potential for the game under consideration; this is 
because while the objective function (21.14) depends on allocations, the users’ strategic 
decisions are bids. Notably, this observation is in sharp contrast to the potentials found 
for routing games in Chapter 18, or for network formation in Chapter 19. For example, 
we cannot use the objective function (21.14) to conclude that best response dynamics 
will converge for our game. Nevertheless, the optimization formulation will help us 
study the price of anarchy of the game in the following section. For later reference, 
we note the following corollary, which uses a variational inequality formulation of the 
preceding theorem. 


Corollary 21.3 Suppose that R > 1. Let w be the unique Nash equilibrium of 
the game defined by $(Q_1, \ldots, Q_R)$, and define d according to (21.13). Then for 
any other vector $\bar{d} \ge 0$ such that $\sum_r \bar{d}_r \le C$, there holds: 

$$ \sum_r \hat{U}_r'(d_r)(\bar{d}_r - d_r) \le 0. \qquad (21.23) $$


PROOF The stated condition follows easily from (21.20)—(21.22), the optimality 
conditions for the problem GAME. 


21.2.3 Price of Anarchy 


We let $d^S$ denote an optimal solution to SYSTEM, and let $d^G$ denote the unique optimal 
solution to GAME. We now investigate the price of anarchy of this system; i.e., how 
much utility is lost because the users are price anticipating? To answer this question, we 
must compare the utility $\sum_r U_r(d_r^G)$ obtained when the users fully evaluate the effect 
of their actions on the price, and the utility $\sum_r U_r(d_r^S)$ obtained by choosing the point 
that maximizes aggregate utility. (We know, of course, that $\sum_r U_r(d_r^G) \le \sum_r U_r(d_r^S)$, 
by definition of $d^S$.) As we show in the following theorem, the efficiency loss is exactly 
25% in the worst case. 


Theorem 21.4 Suppose that R > 1. Suppose also that $U_r(0) \ge 0$ for all r. If 
$d^S$ is any optimal solution to SYSTEM, and $d^G$ is the unique optimal solution to 
GAME, then: 

$$ \sum_r U_r(d_r^G) \ge \frac{3}{4} \sum_r U_r(d_r^S). $$

Furthermore, this bound is tight: for every $\varepsilon > 0$, there exists a choice of R, and 
a choice of (linear) utility functions $U_r$, $r = 1, \ldots, R$, such that 

$$ \sum_r U_r(d_r^G) \le \left(\frac{3}{4} + \varepsilon\right) \left(\sum_r U_r(d_r^S)\right). $$

PROOF Our proof will rely on the following constant $\beta$:¹ 


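$$ \beta = \inf_{U \in \mathcal{U}} \; \inf_{C > 0} \; \inf_{0 \le d, \bar{d} \le C} \frac{U(d) + \hat{U}'(d)(\bar{d} - d)}{U(\bar{d})}. \qquad (21.24) $$

Recall the definition of $\mathcal{U}$ in Assumption 1, and of $\hat{U}$ in (21.17). 

Our proof involves using Corollary 21.3 to prove that $\beta$ is a tight bound on the 
efficiency of Nash equilibria. We first establish that $\beta \ge 3/4$. Note that in (21.24), 
the quotient is strictly larger than 1 if $d > \bar{d}$, and equal to 1 if $d = \bar{d}$. Thus in 
computing $\beta$ we can assume that $d < \bar{d}$ in (21.24). We then have: 

$$
\begin{aligned}
U(d) + \hat{U}'(d)(\bar{d} - d) &= U(d) + U'(d)\left(1 - \frac{d}{C}\right)(\bar{d} - d) \\
&\ge U(d) + \left(1 - \frac{d}{\bar{d}}\right)\left(U(\bar{d}) - U(d)\right) \\
&\ge \left(\frac{d}{\bar{d}}\right)^2 U(\bar{d}) + \left(1 - \frac{d}{\bar{d}}\right) U(\bar{d}) \\
&\ge \frac{3}{4}\, U(\bar{d}).
\end{aligned}
$$

The first inequality follows since $\bar{d} \le C$ and U is concave. The second inequality 
follows since U is concave and nonnegative and $d < \bar{d}$, so $U(d) \ge (d/\bar{d})\,U(\bar{d})$. 
Finally, the third inequality follows since $x^2 - x + 1$ is minimized at x = 1/2. It 
follows from (21.24) that $\beta \ge 3/4$. 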

Next, we show that for any $\delta > 0$, there exists an example where the ratio 
of Nash aggregate utility to maximum aggregate utility is at most $\beta + \delta$. Our 
approach is essentially the same as that in Example 17.6. Fix U and $d \le \bar{d}$, and let 
$C = \bar{d}$. Consider the following example. Suppose that R > 1 users compete for the 
resource. Let user 1 have utility function $U_1 = U$, and suppose users 2, ..., R have 
linear utility functions with slope $\hat{U}'(d)$; i.e., $U_r(d_r) = \hat{U}'(d)\, d_r = (U'(d)(1 - d/C))\, d_r$. 
Let $d^S$ denote an optimal solution to SYSTEM for this model; since 
one feasible solution involves allocating the entire resource $\bar{d}$ to user 1, we must 
have $\sum_r U_r(d_r^S) \ge U(\bar{d})$. On the other hand, recall that at any Nash equilibrium 
at least two users have positive quantities; and since the Nash equilibrium is 
unique, we conclude that all users 2, ..., R receive the same positive quantity. 
Thus as $R \to \infty$, we must have $d_r \downarrow 0$ for r = 2, ..., R. From (21.20)–(21.21), 
it follows that the Nash price $\sum_r w_r/C$ must converge to $\hat{U}'(d)$ as $R \to \infty$. Thus, 
at the Nash equilibrium, user 1 receives an allocation $d + \varepsilon$, and all other users 
receive an allocation $(\bar{d} - d - \varepsilon)/(R - 1)$, where $\varepsilon \to 0$ as $R \to \infty$. The total 
Nash utility thus converges to $U(d) + \hat{U}'(d)(\bar{d} - d)$. The limiting ratio of Nash 
aggregate utility to maximum aggregate utility is thus less than or equal to 

$$ \frac{U(d) + \hat{U}'(d)(\bar{d} - d)}{U(\bar{d})}. $$

We conclude that for any $\delta > 0$, there exists a game $(Q_1, \ldots, Q_R)$ in which the 
ratio of Nash aggregate utility to maximum aggregate utility is at most $\beta + \delta$. 
By considering the special case in which $U(d) = d$, $d = 1/2$, and $\bar{d} = 1$, the 
preceding construction yields a limiting efficiency ratio of exactly 3/4. Combined 
with the previous argument that $\beta \ge 3/4$, it follows that in fact $\beta = 3/4$. 

¹ A slight subtlety arises in this definition if $U(\bar{d}) = 0$; however, in this latter case we can define $\beta$ by only taking 
the infimum over $\bar{d} > 0$. This does not change any of the subsequent arguments. 

It remains to show that the bound holds for every resource allocation game. 
Here we simply apply the result of Corollary 21.3. Let $(Q_1, \ldots, Q_R)$ be a resource 
allocation game where users have utility functions $(U_1, \ldots, U_R)$. Let $d^S$ be a 
solution to SYSTEM, and let $d^G$ be a solution to GAME. We have 

$$ \sum_r U_r(d_r^S) \le \frac{1}{\beta} \sum_r \left(U_r(d_r^G) + \hat{U}_r'(d_r^G)(d_r^S - d_r^G)\right) \le \frac{1}{\beta} \sum_r U_r(d_r^G). $$

The first inequality follows by the definition of $\beta$, and the second follows from 
Corollary 21.3. Since $\beta = 3/4$, this concludes the proof. 


The preceding theorem shows that in the worst case, aggregate utility falls by no 
more than 25% when users are able to anticipate the effects of their actions on the 
price of the resource. Furthermore, this bound is essentially tight. In fact, it follows 
from the proof that the worst case consists of a resource of capacity 1, where user 
1 has utility $U_1(d_1) = d_1$, and all other users have utility $U_r(d_r) \approx d_r/2$ (when R is 
large). As $R \to \infty$, at the Nash equilibrium of this game user 1 receives a quantity 
$d_1 = 1/2$, while the remaining users uniformly split the quantity $1 - d_1 = 1/2$ among 
themselves, yielding an aggregate utility of 3/4. On the other hand, the maximum 
aggregate utility possible is clearly 1, achieved by allocating the entire resource to 
user 1. 
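
The convergence to 3/4 in this worst-case family can be checked numerically. The sketch below fixes, as a simplifying assumption of ours, the slope of users 2, ..., R at exactly 1/2 for every finite R and takes C = 1. Solving (21.20)–(21.22) by hand then gives the Nash price $p = (R-1)/(2R-1)$, with $d_1 = 1 - p$ and $d_r = 1 - 2p$ for $r \ge 2$. The function name `worst_case_ratio` is hypothetical.

```python
def worst_case_ratio(R):
    """Ratio of Nash to optimal aggregate utility for C = 1, U_1(d) = d and
    U_r(d) = d / 2 for r = 2..R (an assumed finite-R version of the example)."""
    if R < 2:
        raise ValueError("need at least two users")
    p = (R - 1) / (2 * R - 1)     # Nash price: solves (1 - p) + (R - 1) * (1 - 2p) = 1
    d1 = 1 - p                    # from 1 * (1 - d_1) = p
    d_other = 1 - 2 * p           # from (1/2) * (1 - d_r) = p, for each r >= 2
    nash_utility = d1 + 0.5 * (R - 1) * d_other
    optimal_utility = 1.0         # allocate the whole resource to user 1
    return nash_utility / optimal_utility


if __name__ == "__main__":
    for R in (2, 10, 100, 1000):
        print(R, worst_case_ratio(R))
```

The ratio equals $(3R - 1)/(4R - 2)$, which stays above 3/4 for every finite R and decreases toward it as R grows.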


21.3 A Characterization Theorem 


In this section we ask an axiomatic question: Is the mechanism we have chosen 
“desirable” among a class of mechanisms satisfying certain “reasonable” properties? 
Defining desirability is the simpler of the two tasks: we consider a mechanism to be 
desirable if it minimizes efficiency loss when users are price anticipating. Importantly, 
we ask for this efficiency property independent of the characteristics of the market 
participants (i.e., their cost functions or utility functions). That is, the mechanisms 
we seek are those that perform well under broad assumptions on the nature of the 
preferences of market participants. 

How do we define “reasonable” mechanisms? The most important condition we 
impose is that the strategy space of each market participant should be “simple,” which 
we interpret as low dimensional. Formally, we will focus on mechanisms for which the 
strategy space of each market participant is $\mathbb{R}^+$; i.e., each market participant chooses a 
scalar, which is a parameter that determines his demand function as input to the 
market-clearing mechanism. The primary motivation is that if we view such a mechanism to 
be useful for a communication network setting, information flow is limited; and in 
particular, we would like to implement a market with as little overhead as possible. 
Thus keeping the strategy spaces of the users low dimensional is a reasonable goal.² 
We will show that under a specific set of mathematical assumptions, the proportional 
allocation mechanism in fact minimizes the worst-case efficiency loss when users are 
price anticipating. 

The class of market mechanisms we will consider is defined as follows. A market 
mechanism must operate on a particular environment, defined by a triple (C, R, U): 
C > 0 denotes the capacity of the resource; R > 1 denotes the number of users sharing 
the resource; and $U = (U_1, \ldots, U_R)$ denotes the utility functions of the users, with 
$U_r \in \mathcal{U}$ (cf. Assumption 1). The following definition captures our notion of a market 
mechanism. 


Definition 21.5 A smooth market-clearing mechanism is a differentiable function 
$D: (0, \infty) \times [0, \infty) \to \mathbb{R}$ such that for all C > 0, for all R > 1, and for 
all nonzero $\theta \in (\mathbb{R}^+)^R$, there exists a unique solution p > 0 to the following 
equation: 

$$ \sum_{r=1}^{R} D(p, \theta_r) = C. $$

We let $p_D(\theta)$ denote this solution.³ 


Note that the market-clearing price is undefined if $\theta = 0$. As we will see below, when 
we formulate a game between users for a given mechanism D, we will assume that 
the payoff to all players is $-\infty$ if the composite strategy vector is $\theta = 0$. Note that 
this is slightly different from the definition in Section 21.2.2, where the payoff is U(0) 
to a player with utility function U who submits a strategy $\theta = 0$. We will discuss this 
distinction further later; we simply note for the moment that it does not affect the results 
of this section. 

Our definition of a smooth market-clearing mechanism generalizes the demand 
function interpretation of the proportional allocation mechanism. Recall that for that 
mechanism, each user submits a demand function of the form $D(p, \theta) = \theta/p$, and the 
link manager chooses a price $p_D(\theta)$ to ensure that $\sum_{r=1}^{R} D(p, \theta_r) = C$. Thus, for this 
mechanism, we have $p_D(\theta) = \sum_{r=1}^{R} \theta_r/C$ if $\theta \ne 0$. 

We now generalize competitive equilibria and Nash equilibria to this setting. 


Definition 21.6 Given a utility system (C, R, U) and a smooth market-clearing 
mechanism D, we say that a nonzero vector $\theta \in (\mathbb{R}^+)^R$ is a competitive 
equilibrium if, for $\mu = p_D(\theta)$, there holds for all r: 

$$ \theta_r \in \arg\max_{\bar{\theta}_r \ge 0} \left[ U_r(D(\mu, \bar{\theta}_r)) - \mu D(\mu, \bar{\theta}_r) \right]. \qquad (21.25) $$


² Note that this notion is distinct from “single-parameter domains” as studied in Chapter 9; there it is the true 
valuations of the agents that are one-dimensional, whereas here the true valuations of the agents may be arbitrary 
functions. With one-dimensional strategy spaces, we restrict the ability of users to communicate information 
about their valuations to the mechanism. 

³ Note that we suppress the dependence of this solution on C; where necessary, we will emphasize this dependence. 

Definition 21.7 Given a utility system (C, R, U) and a smooth market-clearing 
mechanism D, we say that a nonzero vector $\theta \in (\mathbb{R}^+)^R$ is a Nash equilibrium if 
there holds for all r: 

$$ \theta_r \in \arg\max_{\bar{\theta}_r \ge 0} Q_r(\bar{\theta}_r; \theta_{-r}), \qquad (21.26) $$

where 

$$
Q_r(\theta_r; \theta_{-r}) =
\begin{cases}
U_r(D(p_D(\theta), \theta_r)) - p_D(\theta)\, D(p_D(\theta), \theta_r), & \text{if } \theta \ne 0; \\
-\infty, & \text{if } \theta = 0.
\end{cases}
\qquad (21.27)
$$

Notice that the payoff $Q_r$ is $-\infty$ if the composite strategy vector is $\theta = 0$, since in this 
case no market-clearing price exists. 

We are now ready to frame the specific class $\mathcal{D}$ of market mechanisms we will 
consider in this section, defined as follows. 


Definition 21.8 The class $\mathcal{D}$ consists of all functions $D(p, \theta)$ such that the 
following conditions are satisfied: 

(i) D is a smooth market-clearing mechanism (cf. Definition 21.5). 

(ii) For all C > 0, and for all $U_r \in \mathcal{U}$, a user’s payoff is concave if he is price 
anticipating; i.e., for all R, and for all $\theta_{-r} \in (\mathbb{R}^+)^{R-1}$, the function 

$$ U_r(D(p_D(\theta), \theta_r)) - p_D(\theta)\, D(p_D(\theta), \theta_r) $$

is concave in $\theta_r > 0$ if $\theta_{-r} = 0$, and concave in $\theta_r \ge 0$ if $\theta_{-r} \ne 0$. 

(iii) For all p > 0, and for all $d \ge 0$, there exists a $\theta \ge 0$ such that $D(p, \theta) = d$. 

(iv) The demand functions are nonnegative; i.e., for all p > 0 and $\theta \ge 0$, $D(p, \theta) \ge 0$. 


We pause here to briefly discuss the conditions in the previous definition. The 
second allows us to characterize Nash equilibria in terms of only first-order conditions. 
To justify this condition, we note that some assumption of quasiconcavity is generally 
used to guarantee existence of pure strategy Nash equilibria. The third condition ensures 
that given a price p and desired allocation $d \in [0, C]$, each player can make a choice of 
$\theta$ to guarantee precisely the allocation d. This is an “expressiveness” condition on the 
mechanism that ensures that all possible demands can be chosen at any market-clearing 
price. The last condition is a normalization condition, which ensures that regardless of 
the bid of a user, he is never required to supply some quantity of the resource (which 
would be the case if we allowed $D(p, \theta) < 0$). The following example gives a family 
of mechanisms that lie in $\mathcal{D}$. 


Example 21.9 Suppose that $D(p, \theta) = \theta p^{-1/c}$, where $c \ge 1$. It is easy to check 
that this class of mechanisms satisfies $D \in \mathcal{D}$ for all choices of c; when c = 1, 
we recover the proportional allocation mechanism of Section 21.2. The 
market-clearing condition yields that $p_D(\theta) = \left(\sum_s \theta_s/C\right)^c$. Note that as a result, the 
allocation to user r at a nonzero vector $\theta$ is 

$$ D(p_D(\theta), \theta_r) = \frac{\theta_r}{\sum_s \theta_s}\, C. $$

In other words, regardless of the value of c, the market-clearing allocations are 
chosen proportional to the bids. This remarkable fact is a special case of a more 
general result we establish below: all mechanisms in $\mathcal{D}$ yield market-clearing 
allocations that are proportional to the bids; they differ only in the market-clearing 
price that is chosen. The exercises study the price of anarchy of the mechanisms 
defined in this example using an approach analogous to the proof of Theorem 21.4. 
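
The observation in Example 21.9 is easy to verify numerically. The following sketch (the function name `power_mechanism` and its parameters are ours, for illustration only) clears the market for the demand functions $D(p, \theta) = \theta p^{-1/c}$ and returns the price together with the allocation; it assumes a nonzero, nonnegative bid vector and $c \ge 1$.

```python
def power_mechanism(bids, capacity, c=1.0):
    """Market clearing for the assumed demand family D(p, theta) = theta * p**(-1/c).

    Returns (price, allocation); c = 1 is the proportional allocation mechanism.
    """
    total = sum(bids)
    if total <= 0:
        raise ValueError("the bid vector must be nonzero")
    # Solve sum_r theta_r * p**(-1/c) = C for p, i.e. p = (sum_r theta_r / C)**c.
    price = (total / capacity) ** c
    allocation = [theta * price ** (-1.0 / c) for theta in bids]
    return price, allocation


if __name__ == "__main__":
    for c in (1.0, 2.0, 3.0):
        print(c, power_mechanism([1.0, 3.0], capacity=2.0, c=c))
```

Running the loop shows the same allocation for every c while the price changes, which is the point made in the example.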


Our interest is in the worst-case ratio of aggregate utility at any Nash equilibrium 
to the optimal value of SYSTEM. Formally, for $D \in \mathcal{D}$ we define a constant $\rho(D)$ as 
follows: 

$$
\rho(D) = \inf \left\{ \frac{\sum_r U_r(D(p_D(\theta), \theta_r))}{\sum_r U_r(d_r^S)} \;\middle|\; C > 0,\ R > 1,\ U \in \mathcal{U}^R,\ d^S \text{ solves SYSTEM, and } \theta \text{ is a Nash equilibrium} \right\}.
$$

Note that since all $U \in \mathcal{U}$ are strictly increasing and nonnegative, the aggregate utility 
$\sum_r U_r(d_r^S)$ is positive for any utility system (C, R, U) with C > 0, and any optimal 
solution $d^S$ to SYSTEM. Note also that we are considering the ratio over all possible 
Nash equilibria, not just the best one for a given instance; thus, we are studying the 
price of anarchy, not the price of stability (cf. Chapter 17). However, Nash equilibria 
may not exist for some utility systems (C, R, U); in this case we set $\rho(D) = -\infty$. 
Our main result in this section is the following theorem. 


Theorem 21.10 Let $D \in \mathcal{D}$ be a smooth market-clearing mechanism. Then: 
(i) There exists a competitive equilibrium $\theta$. Furthermore, for any such $\theta$, the 
resulting allocation d given by $d_r = D(p_D(\theta), \theta_r)$ solves SYSTEM. 
(ii) There exists a concave, strictly increasing, differentiable, and invertible function 
$B: (0, \infty) \to (0, \infty)$ such that for all p > 0 and $\theta \ge 0$: 

$$ D(p, \theta) = \frac{\theta}{B(p)}. $$

(iii) $\rho(D) \le 3/4$, and this bound is met with equality if and only if $D(p, \theta) = \Delta\theta/p$ 
for some $\Delta > 0$. 


Before continuing to the proof of the theorem, we pause to make several critical 
comments about the result. Results (i) and (ii) of the theorem are a characterization of 
the types of mechanisms allowed by the constraints that define D. In particular, notice 
that from (ii), for nonzero $\boldsymbol{\theta}$ we have

\[ B(p_D(\boldsymbol{\theta})) = \frac{\sum_{r=1}^R \theta_r}{C}. \tag{21.28} \]

Thus we must have

\[ D(p_D(\boldsymbol{\theta}), \theta_r) = \frac{\theta_r}{\sum_{s=1}^R \theta_s}\, C; \tag{21.29} \]

in other words, every mechanism in $\mathcal{D}$ chooses allocations in proportion to the bids.
As a result, we conclude that for a given vector $\boldsymbol{\theta}$, when the market clears, mechanisms
in $\mathcal{D}$ differ from the proportional allocation mechanism only in the market-clearing
price; the allocation is the same. Result (iii) of the theorem is then a price of anarchy
result that concerns mechanisms of this form.
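
To make the role of $B$ concrete, the following sketch solves the market-clearing condition (21.28) by bisection for a few choices of $B$ and confirms that the resulting allocations are always proportional to the bids. The three functions $B$ used in the usage note, the bracketing interval, and the sample bids are assumptions of ours for illustration; the code assumes only that $B$ is continuous and strictly increasing.

```python
import math


def clearing_price(bids, capacity, B, lo=1e-12, hi=1e12, iters=200):
    """Solve sum_r bids[r] / B(p) = capacity for p by bisection on a log scale.

    Assumes B is continuous and strictly increasing on (0, inf), and that the
    solution lies in the (assumed) bracket [lo, hi].
    """
    target = sum(bids) / capacity      # B(p) must equal this value, cf. (21.28)
    for _ in range(iters):
        mid = math.sqrt(lo * hi)       # geometric midpoint of the bracket
        if B(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def allocations(bids, capacity, B):
    """Return the clearing price and the allocation theta_r / B(p) of each user."""
    price = clearing_price(bids, capacity, B)
    return price, [theta / B(price) for theta in bids]
```

For instance, calling `allocations([1.0, 2.0, 5.0], 4.0, B)` with `B` equal to `lambda p: p`, `math.sqrt`, or `math.log1p` (each concave, strictly increasing, and onto $(0, \infty)$) returns three different prices but the same allocation vector.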

We emphasize that the theorem here is distinguished from related work because 
the allocation rule (21.29) was not assumed in advance. Rather, the result here starts 
from a set of simple assumptions on the structure of mechanisms to be considered (the 
definition of the class D), and uses them to prove that any mechanism in the class must 
lead to the allocation in (21.29). (See Notes for details.) 


PROOF Throughout the proof we fix a particular mechanism $D \in \mathcal{D}$. Some
computational details are left to the reader.


Step 1: A user's payoff is concave if he is price taking. In other words, we will
show that for all $U \in \mathcal{U}$ and for all $p > 0$, $U(D(p, \theta)) - pD(p, \theta)$ is concave in
$\theta$. The key idea is to use a limiting regime where capacity grows large, so that
users that are price anticipating effectively become price taking.

Formally, we first observe that since $D$ must possess a unique market-clearing
price regardless of the value of $C$, $D(p, \theta)$ must be strictly monotonic in $p$ (for
fixed $\theta > 0$) where it is nonzero, and either (1) $D(p, \theta)$ is nondecreasing in $p$ for
all $\theta > 0$, or (2) $D(p, \theta)$ is nonincreasing in $p$ for all $\theta > 0$.

To complete the proof of this step, fix $\mu > 0$, and fix $\theta > 0$. Now consider a
limit where $R \to \infty$, and $C^R = R\,D(\mu, \theta)$ is the capacity in the $R$'th system.
It is straightforward to check that if the $R - 1$ users $2, \ldots, R$ submit strategy
$\theta$, and the first user submits strategy $\theta'$, then the resulting market-clearing price
$p_D$ converges to $\mu$ as $R \to \infty$, regardless of the value of $\theta'$. This step uses the
fact that either (1) or (2) above holds. Applying the fact that player 1's payoff
must be concave when he is price anticipating and taking limits as $R \to \infty$, it
follows that player 1's payoff is concave when he is price taking for any fixed
price $\mu > 0$.
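
The limiting regime of Step 1 can be illustrated with the mechanisms of Example 21.9, for which the clearing price is available in closed form. The sketch below is only an illustration under that assumption (the proof itself works for any $D \in \mathcal{D}$); the function name and the parameter values in the usage note are ours.

```python
def limiting_price(R, theta, theta_dev, mu, c):
    """Clearing price in the R-th system of Step 1 for D(p, theta) = theta * p**(-1/c)."""
    capacity = R * theta * mu ** (-1.0 / c)    # C^R = R * D(mu, theta)
    total_bid = (R - 1) * theta + theta_dev    # users 2..R bid theta, user 1 bids theta_dev
    return (total_bid / capacity) ** c         # market-clearing price from Example 21.9


if __name__ == "__main__":
    # Illustrative values: user 1 deviates to a bid 50 times larger than the others.
    for R in (10, 1000, 10**6):
        print(R, limiting_price(R, theta=1.0, theta_dev=50.0, mu=2.0, c=3.0))
```

As $R$ grows, the printed price approaches $\mu = 2$ even though user 1's bid is far from $\theta$, which is the sense in which a single user becomes price taking.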


Step 2: There exists a positive function $B$ such that $D(p, \theta) = \theta/B(p)$ for
$p > 0$ and $\theta \geq 0$. By Step 1, a player's payoff is concave when he is price taking.
By appropriately choosing a linear utility function with very large slope and very
small slope, it follows that $D(p, \theta)$ must be concave and convex, respectively, in
$\theta$ for a given $p > 0$. Thus for fixed $p > 0$, $D(p, \theta)$ is an affine function of $\theta$.
Conditions 3 and 4 in Definition 21.8 then imply that the constant term must be
zero, while the coefficient of the linear term is positive; thus, $D(p, \theta) = \theta/B(p)$
for some positive function $B(p)$.


Before continuing, we note that the previous step already implies the remarkable
fact that for any mechanism $D \in \mathcal{D}$, the allocation at the market-clearing
price is made in proportion to the bids $\boldsymbol{\theta}$. This follows from the discussion
following (21.28) above.

Step 3: For all utility systems $(C, R, \mathbf{U})$, there exists a competitive equilibrium,
and it is fully efficient. This step follows primarily because of Condition 3 in
Definition 21.8: given a price $\mu$, a user can first determine his optimal choice
of quantity, and then choose a parameter $\theta$ to express this choice. Formally,
suppose that $\mu = p_D(\boldsymbol{\theta})$, and (21.25) holds. Let $d_r = D(\mu, \theta_r)$; then (21.25)
implies that the necessary conditions (21.8)–(21.9) hold; these are also sufficient
because of Step 1. Furthermore, market clearing implies (21.10) holds. Thus
any competitive equilibrium is fully efficient. Existence follows by letting $\mathbf{d}^S$
be a solution to SYSTEM with Lagrange multiplier $\mu$, and choosing $\theta_r = d_r^S B(\mu)$,
so that $D(\mu, \theta_r) = \theta_r / B(\mu) = d_r^S$.


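Step 4: For all $R > 1$ and $\boldsymbol{\theta}_{-r} \in (\mathbb{R}^+)^{R-1}$, the functions $D(p_D(\boldsymbol{\theta}), \theta_r)$ and
$-p_D(\boldsymbol{\theta})D(p_D(\boldsymbol{\theta}), \theta_r)$ are concave in $\theta_r > 0$ if $\boldsymbol{\theta}_{-r} = 0$, and concave in $\theta_r \geq 0$
if $\boldsymbol{\theta}_{-r} \neq 0$. As in Step 2, this conclusion follows by considering linear utility
functions with very large and very small slope, respectively.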


Step 5: $B$ is an invertible, differentiable, strictly increasing, and concave
function on $(0, \infty)$. We immediately see that $B$ must be invertible on $(0, \infty)$; it
is clearly onto, as the right-hand side of (21.28) can take any value in $(0, \infty)$.
Furthermore, uniqueness of the market-clearing price in (21.28) requires that $B$
is one-to-one as well, and hence invertible. Since $D$ is differentiable, $B$ must be
differentiable as well. Let $\Phi$ denote the differentiable inverse of $B$ on $(0, \infty)$; we
will show $\Phi$ is strictly increasing and convex.

Let

\[ w_r(\boldsymbol{\theta}) = p_D(\boldsymbol{\theta})\, D(p_D(\boldsymbol{\theta}), \theta_r) = \Phi\!\left(\frac{\sum_{s=1}^R \theta_s}{C}\right) \frac{\theta_r}{\sum_{s=1}^R \theta_s}\, C. \tag{21.30} \]

By Step 4, $w_r(\boldsymbol{\theta})$ is convex in $\theta_r > 0$. By considering strategy vectors $\boldsymbol{\theta}$ for which
$\boldsymbol{\theta}_{-r} = 0$, it follows that $\Phi$ is convex. Finally, the fact that $\Phi$ is strictly increasing
follows by differentiating twice and considering the limit where $\theta_r \to 0$, while
keeping $\boldsymbol{\theta}_{-r}$ constant and nonzero.⁴ This establishes the desired facts regarding $B$.


Step 6: Let $(C, R, \mathbf{U})$ be a utility system. A vector $\boldsymbol{\theta} \geq 0$ is a Nash equilibrium
if and only if at least two components of $\boldsymbol{\theta}$ are nonzero, and there exists a nonzero
vector $\mathbf{d} \geq 0$ and a scalar $\mu > 0$ such that $\theta_r = \mu d_r$ for all $r$, $\sum_r d_r = C$, and
the following conditions hold:

\[ U_r'(d_r)\left(1 - \frac{d_r}{C}\right) = \Phi(\mu)\left(1 - \frac{d_r}{C}\right) + \mu\Phi'(\mu)\left(\frac{d_r}{C}\right), \quad \text{if } d_r > 0; \tag{21.31} \]
\[ U_r'(0) \leq \Phi(\mu), \quad \text{if } d_r = 0. \tag{21.32} \]

In this case $d_r = D(p_D(\boldsymbol{\theta}), \theta_r)$, $\mu = \sum_{r=1}^R \theta_r / C$, and $\Phi(\mu) = p_D(\boldsymbol{\theta})$. Further,
there exists a unique Nash equilibrium. The proof of this step is similar to the
proof of Nash equilibrium characterization in Theorem 21.2; we omit the details,
and refer the reader to the Notes section.

⁴ While the most direct argument uses twice differentiability of $\Phi$, it is possible to make a similar argument even
if $\Phi$ is only once differentiable, by arguing only in terms of increments of $\Phi$.
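
As a reading aid, the stationarity condition (21.31) can be recovered from the payment expression (21.30). The following is only a sketch for the case $d_r > 0$ with $\boldsymbol{\theta}_{-r} \neq 0$, using the shorthand $\mu = \sum_s \theta_s / C$ from Step 6; it is not a substitute for the full argument referenced in the Notes.

```latex
\begin{align*}
d_r &= \frac{\theta_r}{\mu}, \qquad w_r = \Phi(\mu)\, d_r, \qquad
  \frac{\partial \mu}{\partial \theta_r} = \frac{1}{C}, \\
\frac{\partial d_r}{\partial \theta_r}
  &= \frac{1}{\mu} - \frac{\theta_r}{\mu^2 C}
   = \frac{1}{\mu}\left(1 - \frac{d_r}{C}\right), \\
\frac{\partial w_r}{\partial \theta_r}
  &= \Phi'(\mu)\,\frac{d_r}{C} + \Phi(\mu)\,\frac{1}{\mu}\left(1 - \frac{d_r}{C}\right).
\end{align*}
% Stationarity of U_r(d_r) - w_r in theta_r, multiplied through by mu:
\[
U_r'(d_r)\left(1 - \frac{d_r}{C}\right)
  = \Phi(\mu)\left(1 - \frac{d_r}{C}\right) + \mu\Phi'(\mu)\,\frac{d_r}{C}.
\]
```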


Step 7: For any $\epsilon > 0$, there exists a utility system $(C, R, \mathbf{U})$ such that at any
Nash equilibrium $\boldsymbol{\theta}$, the aggregate utility is no more than $3/4 + \epsilon$ of the maximal
aggregate utility. Consider a utility system with the following properties. Let
$C = 1$. Fix $\mu > 0$, and let $U_1(d_1) = A d_1$, where $A > \Phi(\mu)$. We will search for
a solution to the Nash conditions (21.31) to (21.32) with market-clearing price
$\Phi(\mu)$.

We start by calculating $d_1$ by assuming it is nonzero, and applying (21.31):

\[ d_1 = \frac{(A - \Phi(\mu))\,C}{A - \Phi(\mu) + \mu\Phi'(\mu)}. \tag{21.33} \]

In the spirit of the proof of Theorem 21.4, we will now choose users $2, \ldots, R$ to
have identical linear utility functions, with slopes less than $A$. As we will see, this
will be possible if $R$ is large enough.

Formally, let $d = (C - d_1)/(R - 1)$, and (cf. (21.31)) define

\[ \alpha = \frac{\Phi(\mu) + \big(\mu\Phi'(\mu) - \Phi(\mu)\big)\, d/C}{1 - d/C}. \tag{21.34} \]

Let $U_r(d_r) = \alpha d_r$ for $r = 2, \ldots, R$. Note that if

\[ \frac{C}{R} < \frac{(A - \Phi(\mu))\,C}{A - \Phi(\mu) + \mu\Phi'(\mu)}, \tag{21.35} \]

then $\alpha < A$. This guarantees $d_1$ must be nonzero at any Nash equilibrium, so
that the computation in (21.33) is valid. In turn, letting $d_r = d$ for $r = 2, \ldots, R$,
this implies that $(d_1, \ldots, d_R)$ and $\mu$ are a valid solution to (21.31)–(21.32), when
users have utility functions $U_1, \ldots, U_R$.

Now consider the limiting ratio of Nash aggregate utility to maximal aggregate
utility, as $R \to \infty$. We have $d \to 0$, so $\alpha \to \Phi(\mu)$. Furthermore, regardless of
$R$ a solution to SYSTEM is to allocate the entire resource to user 1, so the
maximal aggregate utility is $AC$. Thus the limiting ratio of Nash aggregate utility
to maximal aggregate utility becomes

\[ \frac{A - \Phi(\mu)}{A - \Phi(\mu) + \mu\Phi'(\mu)} + \left(1 - \frac{A - \Phi(\mu)}{A - \Phi(\mu) + \mu\Phi'(\mu)}\right)\frac{\Phi(\mu)}{A}. \tag{21.36} \]

We now want to find the choices of $A$ and $\mu$ which minimize this value.

For notational simplicity, we define $x = \Phi(\mu)/A$, and $\Psi(\mu) = \mu\Phi'(\mu)/\Phi(\mu)$.
Note that given the convexity and invertibility of $\Phi$, we have $\Psi(\mu) \geq 1$. Then
(21.36) is equivalent to

\[ F(x; \mu) = \frac{(1 - x)^2}{1 + (\Psi(\mu) - 1)x} + x. \tag{21.37} \]

It is straightforward to establish that the preceding expression is strictly convex
in $x$ for fixed $\mu$. Let $G(\Psi(\mu))$ denote the minimal value of $F(x; \mu)$ for $x \in (0, 1)$;
by differentiating, it follows that $G(\Psi)$ is defined for $\Psi \geq 1$ according to

\[ G(\Psi) = \begin{cases} \dfrac{3}{4}, & \text{if } \Psi = 1; \\[2ex] \dfrac{2\Psi^2 - 3\Psi\sqrt{\Psi} + \sqrt{\Psi}}{(\Psi - 1)^2\sqrt{\Psi}}, & \text{if } \Psi > 1. \end{cases} \tag{21.38} \]

The function $G$ is plotted in Figure 21.1. It is straightforward to verify that
$G(\Psi)$ is continuous and strictly decreasing for $\Psi \geq 1$ so that the worst-case
example is given by finding $\mu > 0$ such that $\Psi(\mu)$ is maximized. Furthermore, it
is straightforward to check that $G(\Psi) \leq 3/4$, establishing the required claim.

Figure 21.1. The function $G(\Psi)$ defined in (21.38), plotted for $\Psi$ between 0 and 50. Note that $G(\Psi)$ is strictly decreasing, with $G(1) = 3/4$.
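
The construction of Step 7 and the closed form (21.38) can be checked numerically. The sketch below is illustrative only: it evaluates the finite-$R$ efficiency ratio of the constructed system, and compares the closed form for $G$ with a direct minimization of (21.37). The function names, the ternary-search routine, and the sample choice $\Phi(\mu) = \mu^4$ in the usage note are our assumptions.

```python
import math


def F(x, psi):
    """Limiting efficiency ratio (21.37), with x = Phi(mu)/A and psi = Psi(mu)."""
    return (1.0 - x) ** 2 / (1.0 + (psi - 1.0) * x) + x


def G_closed(psi):
    """Closed form (21.38) for the minimum of F over x in (0, 1)."""
    if psi == 1.0:
        return 0.75
    root = math.sqrt(psi)
    return (2 * psi ** 2 - 3 * psi * root + root) / ((psi - 1.0) ** 2 * root)


def G_numeric(psi, iters=200):
    """Minimize the strictly convex function F(., psi) on [0, 1] by ternary search."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        a, b = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if F(a, psi) < F(b, psi):
            hi = b
        else:
            lo = a
    return F((lo + hi) / 2, psi)


def finite_R_ratio(A, mu, R, phi, dphi, C=1.0):
    """Nash-to-optimal aggregate utility ratio of the Step 7 system with R users."""
    d1 = (A - phi(mu)) * C / (A - phi(mu) + mu * dphi(mu))   # (21.33)
    d = (C - d1) / (R - 1)                                   # allocation of users 2..R
    alpha = phi(mu) + mu * dphi(mu) * d / (C - d)            # slope solving (21.31) at d
    if alpha >= A:
        raise ValueError("R too small: condition (21.35) fails")
    return (A * d1 + alpha * (C - d1)) / (A * C)
```

For example, with $\Phi(\mu) = \mu^4$ (so $\Psi = 4$), $\mu = 1$, and $A = 3$, the call `finite_R_ratio(3.0, 1.0, 10**6, lambda m: m**4, lambda m: 4 * m**3)` is close to `G_closed(4.0)`, which equals $5/9$.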


Step 8: For any mechanism other than the proportional allocation mechanism,
the worst-case efficiency is strictly lower than 3/4. For the proportional allocation
mechanism, we have $\Psi(\mu) = 1$, and we have already established that the
efficiency $\rho$ is exactly 3/4. On the other hand, it is straightforward to check that if
$B(p)$ is nonlinear, then the maximal value of $\Psi(\mu)$ in the preceding step is strictly
greater than 1; and in this case $G(\Psi(\mu))$ is strictly less than 3/4. Thus there exists
a game with efficiency ratio strictly lower than 3/4 for such a mechanism. This
completes the proof.


We make several comments regarding the proof. First, notice that every mechanism 
in the described class allocates in proportion to the bids of the players; in this sense all 
mechanisms in D are “proportional allocation mechanisms.” However, the efficiency 
loss is minimized exactly when this mechanism charges each user exactly their bid. 
Second, it is possible to show that the bound constructed in Steps 7-8 of the proof is
in fact a tight bound on the price of anarchy of the mechanisms under consideration;
it is possible to reformulate this bound so that it depends only on the elasticity of
the function $B(p)$, i.e., the quantity $\inf_{p > 0} pB'(p)/B(p)$. (This is not surprising, since
$\Psi(\mu)$ is the elasticity of the function $\Phi$, which is the inverse of $B$.) It is surprising
that the price of anarchy of a general class of such mechanisms can be reduced to this
parsimonious calculation.
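
For the family of Example 21.9 the elasticity is constant, which makes the bound explicit. The following short calculation is a sketch that takes the closed form (21.38) and the tightness claim above as given; the value $c = 4$ is an illustrative choice.

```latex
\begin{align*}
B(p) &= p^{1/c}, \qquad \frac{pB'(p)}{B(p)} = \frac{1}{c}, \\
\Phi(\mu) &= B^{-1}(\mu) = \mu^{c}, \qquad
  \Psi(\mu) = \frac{\mu\Phi'(\mu)}{\Phi(\mu)} = c, \\
G(c) &= \frac{2c^2 - 3c\sqrt{c} + \sqrt{c}}{(c-1)^2\sqrt{c}} \quad (c > 1), \qquad
  G(4) = \frac{32 - 24 + 2}{9 \cdot 2} = \frac{5}{9}.
\end{align*}
```

Here $\Psi$ is the reciprocal of the elasticity of $B$, as expected for the elasticity of an inverse function.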

Finally, we note one potentially undesirable feature of the family of market-clearing
mechanisms considered: the payoff to user $r$ is defined as $-\infty$ when the composite
strategy vector is $\boldsymbol{\theta} = 0$ (cf. (21.27)). This definition is required because when the
composite strategy vector is $\boldsymbol{\theta} = 0$, a market-clearing price may not exist. One possible
remedy is to restrict attention instead to mechanisms where $D(p, \theta) = 0$ if $\theta = 0$, for
all $p \geq 0$; in this case we can define $p_D(\boldsymbol{\theta}) = 0$ if $\boldsymbol{\theta} = 0$, and let the payoff to user $r$ be
$U_r(0)$ if $\theta_r = 0$. This condition amounts to a "normalization" on the market-clearing
mechanism. It is possible to show that this modification does not alter the conclusion
of Theorem 21.10.


21.4 The Vickrey–Clarke–Groves Approach


The mechanisms we considered in the last section had several restrictions placed on 
them; chief among these are that (1) users are restricted to using “simple” strategy 
spaces and (2) the mechanism uses only a single price to clear the market. On the other 
hand, one could consider both generalizations where users are allowed to use more 
complex strategies, perhaps declaring their entire utility function to the market; and 
also, where price discrimination is allowed so that each user is charged a personalized 
per-unit price for the resource. 

The best known solution employing both these generalizations is the VCG approach 
to eliciting utility information (see Notes, and Chapter 9). Such mechanisms allow 
users to declare their entire utility functions, and then charge users individualized 
prices so that they have the incentive to truthfully declare their utilities. We review 
VCG mechanisms in Section 21.4.1. 

In this section we are interested in deciding whether the same outcome can be 
realized preserving restriction (1) above, but removing restriction (2): that is, can 
mechanisms with “simple” strategy spaces that employ price discrimination achieve 
full efficiency? In Section 21.4.2 we present an alternate class of mechanisms, inspired 
by the VCG class, in which users only submit scalar strategies to the mechanism; we 
call such mechanisms scalar strategy VCG (SSVCG) mechanisms. We show that these 
mechanisms have desirable efficiency properties. In particular, we establish existence 
of an efficient Nash equilibrium, and under an additional condition, we also establish 
that all Nash equilibria are efficient. 


21.4.1 VCG Mechanisms 


In the VCG class of mechanisms, the basic approach is to let the strategy space of
each user $r$ be the set $\mathcal{U}$ of possible utility functions, as defined in Assumption 1, and
structure the payments made by each user so that the payoff of each user $r$ has the same
form as the objective function in SYSTEM, (21.1). As VCG mechanisms have been
introduced in Chapter 9, we only use this section to fix notation for our subsequent
discussion. For each $r$, we use $\tilde{U}_r$ to denote the declared utility function of user $r$, and
use $\tilde{\mathbf{U}} = (\tilde{U}_1, \ldots, \tilde{U}_R)$ to denote the vector of declared utilities.

Suppose that user $r$ receives an allocation $d_r$, but has to make a payment $t_r$; we use
the notation $t_r$ to distinguish from the bid $w_r$ of Section 21.2. Then the payoff to user
$r$ is

\[ U_r(d_r) - t_r. \]

On the other hand, the social objective (21.1) can be written as

\[ U_r(d_r) + \sum_{s \neq r} U_s(d_s). \]

Given a vector of declared utility functions $\tilde{\mathbf{U}}$, a VCG mechanism chooses the allocation
$\mathbf{d}(\tilde{\mathbf{U}})$ as an optimal solution to SYSTEM for the declared utility functions $\tilde{\mathbf{U}}$. For
simplicity, let $\mathcal{X} = \{\mathbf{d} \geq 0 : \sum_r d_r \leq C\}$; this is the feasible region for SYSTEM. Then
for a VCG mechanism, we have

\[ \mathbf{d}(\tilde{\mathbf{U}}) \in \arg\max_{\mathbf{d} \in \mathcal{X}} \sum_r \tilde{U}_r(d_r). \tag{21.39} \]

The payments are structured so that

\[ t_r(\tilde{\mathbf{U}}) = -\sum_{s \neq r} \tilde{U}_s(d_s(\tilde{\mathbf{U}})) + h_r(\tilde{\mathbf{U}}_{-r}). \tag{21.40} \]

Here $h_r$ is an arbitrary function of the declared utilities of users other than $r$. In general,
we note that mechanisms of this form do not use a single price to clear the market; i.e.,
the per-unit price paid by user $r$, $t_r(\tilde{\mathbf{U}})/d_r(\tilde{\mathbf{U}})$, will not be the same for all users. (See
also Exercise 21.3.)

For our purposes, the interesting feature of the VCG mechanism is that there exists a 
dominant strategy equilibrium that elicits the true utility functions from the users, and 
in turn (because of the definition of $\mathbf{d}(\tilde{\mathbf{U}})$) chooses an efficient allocation. (See Chapter
9 for a formal statement of these results, where it is shown that the VCG mechanism is 
incentive compatible.) In the next section, we explore a class of mechanisms inspired 
by the VCG mechanisms, but with limited communication requirements. 
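
To illustrate the incentive property, the following sketch implements a VCG mechanism with the Clarke pivot choice of $h_r$ for a hypothetical family of utilities $U_r(d) = a_r\sqrt{d}$, for which SYSTEM has the closed-form solution $d_r = C a_r^2 / \sum_s a_s^2$. The utility family, the function names, and the numbers in the usage note are assumptions made for illustration; at least two users are assumed.

```python
import math


def vcg_outcome(declared, capacity):
    """Allocation and Clarke-pivot payments for declared utilities a_r * sqrt(d_r).

    `declared` holds the positive scale factors a_r announced by the users.
    """
    total = sum(a * a for a in declared)
    alloc = [capacity * a * a / total for a in declared]   # maximizer in (21.39)
    payments = []
    for r, a_r in enumerate(declared):
        # Declared utility of the other users at the chosen allocation.
        others_now = sum(a * math.sqrt(d)
                         for s, (a, d) in enumerate(zip(declared, alloc)) if s != r)
        # h_r: best aggregate declared utility of the others when user r is absent.
        others_alone = math.sqrt(capacity * (total - a_r * a_r))
        payments.append(others_alone - others_now)         # (21.40) with this h_r
    return alloc, payments


def payoff(r, true_scale, declared, capacity):
    """True payoff U_r(d_r) - t_r of user r under the declared profile."""
    alloc, payments = vcg_outcome(declared, capacity)
    return true_scale * math.sqrt(alloc[r]) - payments[r]
```

With true scale factors `[1.0, 2.0, 3.0]` and capacity `5.0`, evaluating `payoff(0, 1.0, [a, 2.0, 3.0], 5.0)` for several declarations `a` shows that user 0 does best by declaring the true value `a = 1.0`.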


21.4.2 Scalar Strategy VCG Mechanisms 


We now consider a class of mechanisms where each user’s strategy is a submitted 
utility function (as in the VCG mechanisms) except that users are allowed only to 
choose from a given single parameter family of utility functions. One cannot expect 
such mechanisms to have efficient dominant strategy equilibria, and we will focus 
instead on the efficiency properties of the resulting Nash equilibria.

Formally, scalar strategy VCG (SSVCG) mechanisms allow users to choose from a
given family of utility functions $\bar{U}(\cdot\,; \theta)$, parameterized by $\theta \in (0, \infty)$.⁵ We make the
following assumptions about this family.


Assumption 2:

(i) For every $\theta > 0$, the function $\bar{U}(\cdot\,; \theta) : d \mapsto \bar{U}(d; \theta)$ belongs to $\mathcal{U}$ (i.e., it is concave,
strictly increasing, continuous, and differentiable), and is also strictly concave.
(ii) For every $\gamma \in (0, \infty)$ and $d \geq 0$, there exists a $\theta > 0$ such that $\bar{U}'(d; \theta) = \gamma$.⁶


Given $\boldsymbol{\theta}$, the mechanism chooses $\mathbf{d}(\boldsymbol{\theta})$ such that

\[ \mathbf{d}(\boldsymbol{\theta}) = \arg\max_{\mathbf{d} \in \mathcal{X}} \sum_r \bar{U}(d_r; \theta_r). \tag{21.41} \]

Since $\bar{U}(\cdot\,; \theta_r)$ is strictly concave for each $r$, the solution $\mathbf{d}(\boldsymbol{\theta})$ is uniquely defined. (Note
the similarity between (21.39) and (21.41).)
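By analogy with the expression (21.40), the monetary payment by user $r$ is

\[ t_r(\boldsymbol{\theta}) = -\sum_{s \neq r} \bar{U}(d_s(\boldsymbol{\theta}); \theta_s) + h_r(\boldsymbol{\theta}_{-r}). \tag{21.42} \]

Here $h_r$ is a function that depends only on the strategies $\boldsymbol{\theta}_{-r} = (\theta_s, s \neq r)$ submitted by
the users other than $r$. While we do not advocate any particular choice of $h_r$, a natural
candidate is to define $h_r(\boldsymbol{\theta}_{-r}) = \sum_{s \neq r} \bar{U}(d_s(\boldsymbol{\theta}_{-r}); \theta_s)$, where $\mathbf{d}(\boldsymbol{\theta}_{-r})$ is the aggregate
utility maximizing allocation excluding user $r$. This leads to a natural scalar strategy
analogue of the Clarke pivot mechanism (cf. Chapter 9).

Given $h_r$, the payoff to user $r$ is

\[ P_r(d_r(\boldsymbol{\theta}), t_r(\boldsymbol{\theta})) = U_r(d_r(\boldsymbol{\theta})) + \sum_{s \neq r} \bar{U}(d_s(\boldsymbol{\theta}); \theta_s) - h_r(\boldsymbol{\theta}_{-r}). \]

A strategy vector $\boldsymbol{\theta}$ is a Nash equilibrium if no user can profitably deviate through
a unilateral deviation, i.e., if for all users $r$ there holds:

\[ P_r(d_r(\boldsymbol{\theta}), t_r(\boldsymbol{\theta})) \geq P_r(d_r(\theta_r', \boldsymbol{\theta}_{-r}), t_r(\theta_r', \boldsymbol{\theta}_{-r})), \quad \text{for all } \theta_r' > 0. \tag{21.43} \]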


We start with the following key lemma, proven using an argument analogous to the 
proof that truthtelling is a dominant strategy equilibrium of the VCG mechanism (see 
Chapter 9). 


⁵ Note that, by contrast with Section 21.3, the choice of bid $\theta$ by a user indexes a utility function, rather than
a demand function. However, this is not particularly crucial: if a user with utility function $U$ maximizes
$U(d) - pd$ (i.e., the user acts as a price taker), the solution yields the demand function $D(p) = (U')^{-1}(p)$.
Up to additive constant, the utility function and demand function can be recovered from each other. Thus,
equivalently, we could define SSVCG mechanisms where users submit demand functions from a parameterized
class. We define our SSVCG mechanisms according to Assumption 2 to maintain consistency with the definition
of VCG mechanisms in Section 21.4.1, as well as in Chapter 9.

⁶ Since we do not assume differentiability with respect to $\theta$, the only differentiation of $\bar{U}$ is with respect to the
first coordinate $d$, and $\bar{U}'(d; \theta)$ will always stand for the derivative with respect to $d$.


Lemma 21.11 The vector $\boldsymbol{\theta}$ is a Nash equilibrium of the SSVCG mechanism
if and only if for all $r$:

\[ \mathbf{d}(\boldsymbol{\theta}) \in \arg\max_{\mathbf{d} \in \mathcal{X}} \left[ U_r(d_r) + \sum_{s \neq r} \bar{U}(d_s; \theta_s) \right]. \tag{21.44} \]


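PROOF Fix a user $r$. Since $\theta_r$ does not affect $h_r$, from (21.43) user $r$ will choose
$\theta_r$ to maximize the following effective payoff:

\[ U_r(d_r(\boldsymbol{\theta})) + \sum_{s \neq r} \bar{U}(d_s(\boldsymbol{\theta}); \theta_s). \tag{21.45} \]

The optimal value of the objective function in (21.44) is certainly an upper bound
to user $r$'s effective payoff (21.45). Thus, given a vector $\boldsymbol{\theta}$, if (21.44) is satisfied
for all users $r$, then (21.43) holds for all users $r$, and we conclude $\boldsymbol{\theta}$ is a Nash
equilibrium.

Conversely, given a vector $\boldsymbol{\theta}$, suppose that (21.44) is not satisfied for some user
$r$. We will show $\boldsymbol{\theta}$ cannot be a Nash equilibrium. Since $\mathcal{X}$ is compact, an optimal
solution exists to the problem in (21.44) for user $r$; call this optimal solution $\mathbf{d}^*$.
The vector $\mathbf{d}^*$ must satisfy the first-order optimality conditions (21.8)–(21.10),
which only involve the first derivatives $U_r'(d_r^*)$ and $(\bar{U}'(d_s^*; \theta_s), s \neq r)$. Suppose
now that user $r$ chooses $\theta_r' > 0$ such that $\bar{U}'(d_r^*; \theta_r') = U_r'(d_r^*)$. Then, $\mathbf{d}^*$ also
satisfies the optimality conditions for the problem (21.41). Since $\mathbf{d}(\theta_r', \boldsymbol{\theta}_{-r})$ is the
unique optimal solution to (21.41) when the strategy vector is $(\theta_r', \boldsymbol{\theta}_{-r})$, we must
have $\mathbf{d}(\theta_r', \boldsymbol{\theta}_{-r}) = \mathbf{d}^*$. Thus we have

\begin{align*}
P_r(d_r(\boldsymbol{\theta}), t_r(\boldsymbol{\theta})) &< U_r(d_r^*) + \sum_{s \neq r} \bar{U}(d_s^*; \theta_s) - h_r(\boldsymbol{\theta}_{-r}) \\
&= U_r(d_r(\theta_r', \boldsymbol{\theta}_{-r})) + \sum_{s \neq r} \bar{U}(d_s(\theta_r', \boldsymbol{\theta}_{-r}); \theta_s) - h_r(\boldsymbol{\theta}_{-r}) \\
&= P_r(d_r(\theta_r', \boldsymbol{\theta}_{-r}), t_r(\theta_r', \boldsymbol{\theta}_{-r})).
\end{align*}

(The first inequality follows by the assumption that (21.44) is not satisfied for
user $r$.) We conclude that (21.43) is violated for user $r$, so $\boldsymbol{\theta}$ is not a Nash
equilibrium.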


The following corollary states that there exists a Nash equilibrium which is efficient. 
Furthermore, at this efficient Nash equilibrium, all users truthfully reveal their utilities 
in a local sense: each user $r$ chooses $\theta_r$ so that the declared marginal utility $\bar{U}'(d_r(\boldsymbol{\theta}); \theta_r)$
is equal to the true marginal utility $U_r'(d_r(\boldsymbol{\theta}))$.


Corollary 21.12 For any SSVCG mechanism, there exists an efficient Nash
equilibrium $\boldsymbol{\theta}$ defined as follows: Let $\mathbf{d}^S$ be an optimal solution to SYSTEM. Each
user $r$ chooses $\theta_r$ so that $\bar{U}'(d_r^S; \theta_r) = U_r'(d_r^S)$. The resulting allocation satisfies
$\mathbf{d}(\boldsymbol{\theta}) = \mathbf{d}^S$.


PROOF By Assumption 2, each user $r$ can choose $\theta_r$ so that $\bar{U}'(d_r^S; \theta_r) = U_r'(d_r^S)$.
For this vector $\boldsymbol{\theta}$, it is clear that $\mathbf{d}(\boldsymbol{\theta}) = \mathbf{d}^S$, since the optimal solution to
(21.41) is uniquely determined, and the optimality conditions for (21.41) involve
only the first derivatives $\bar{U}'(d_r(\boldsymbol{\theta}); \theta_r)$. By the same argument it also follows that
$\mathbf{d}^S$ is an optimal solution in (21.44). Since $\mathbf{d}(\boldsymbol{\theta}) = \mathbf{d}^S$, we conclude that (21.44)
is satisfied for all $r$, and thus $\boldsymbol{\theta}$ is a Nash equilibrium.
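
The construction in Corollary 21.12 can be traced through on a small instance. The sketch below assumes a particular parameterized family, $\bar{U}(d; \theta) = \theta\log(1 + d)$, which satisfies Assumption 2, and hypothetical true utilities $U_r(d) = a_r\sqrt{d}$; both choices, the water-filling solver, and the function names are ours and serve only as an illustration.

```python
import math


def ssvcg_allocation(thetas, capacity, iters=200):
    """Solve (21.41) for the assumed family Ubar(d; theta) = theta * log(1 + d).

    The optimum is a water-filling rule d_r = max(theta_r / lam - 1, 0), where
    the multiplier lam is found by bisection so that sum_r d_r = capacity.
    """
    lo = max(thetas) / (1.0 + capacity)   # total allocation is at least capacity here
    hi = max(thetas)                      # total allocation is zero here
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        total = sum(max(t / lam - 1.0, 0.0) for t in thetas)
        if total > capacity:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return [max(t / lam - 1.0, 0.0) for t in thetas]


def efficient_equilibrium(scales, capacity):
    """Strategies of Corollary 21.12 when U_r(d) = scales[r] * sqrt(d)."""
    total = sum(a * a for a in scales)
    d_opt = [capacity * a * a / total for a in scales]   # solves SYSTEM for these utilities
    # Match declared and true marginal utility at d_opt:
    # theta / (1 + d) = a / (2 * sqrt(d)).
    thetas = [a / (2.0 * math.sqrt(d)) * (1.0 + d) for a, d in zip(scales, d_opt)]
    return d_opt, thetas
```

Calling `efficient_equilibrium([1.0, 2.0, 3.0], 5.0)` and passing the returned strategies to `ssvcg_allocation` reproduces the efficient allocation, as the corollary asserts.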


We note that, as in classical VCG mechanisms, there can be additional, possibly 
inefficient, Nash equilibria, as the following example shows. 


Example 21.13 Consider a system with $R$ identical users with strictly concave
utility function $U$. Suppose that user 1 chooses $\theta_1$ so that $\bar{U}'(C; \theta_1) > U'(0)$, and
every other user $r$ chooses $\theta_r$ so that $\bar{U}'(0; \theta_r) < U'(C)$. Since $U'(C) < U'(0)$,
it follows that (21.44) is satisfied for all users $r$. Thus this is a Nash equilibrium
where the entire resource is allocated to user 1; however, the unique optimal
solution to SYSTEM is symmetric, and allocates $C/R$ units of the resource to each
of the $R$ users.
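
A concrete instance of Example 21.13 can be checked by brute force. The sketch below assumes the family $\bar{U}(d; \theta) = \theta\log(1 + d)$ and the true utility $U(d) = \log(1 + d)$ with $R = 3$ and $C = 1$, so that $U'(0) = 1$ and $U'(C) = 1/2$; the strategy vector, the deviation grid, and the function names are illustrative assumptions. The grid search only tests finitely many deviations, so it supports rather than proves the equilibrium claim.

```python
import math


def allocate(thetas, capacity, iters=200):
    """Allocation (21.41) for Ubar(d; theta) = theta * log(1 + d), via bisection."""
    lo, hi = max(thetas) / (1.0 + capacity), max(thetas)
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if sum(max(t / lam - 1.0, 0.0) for t in thetas) > capacity:
            lo = lam
        else:
            hi = lam
    return [max(t / hi - 1.0, 0.0) for t in thetas]


def effective_payoff(r, thetas, capacity):
    """User r's effective payoff (21.45) for the true utility log(1 + d)."""
    d = allocate(thetas, capacity)
    return math.log1p(d[r]) + sum(t * math.log1p(x)
                                  for s, (t, x) in enumerate(zip(thetas, d)) if s != r)


def is_nash_on_grid(thetas, capacity, grid):
    """Check that no user gains by deviating to any strategy on the grid."""
    for r in range(len(thetas)):
        base = effective_payoff(r, thetas, capacity)
        for dev in grid:
            trial = list(thetas)
            trial[r] = dev
            if effective_payoff(r, trial, capacity) > base + 1e-9:
                return False
    return True


if __name__ == "__main__":
    bluff = [3.0, 0.4, 0.4]   # Ubar'(C; 3.0) = 1.5 > U'(0); Ubar'(0; 0.4) = 0.4 < U'(C)
    grid = [0.05 * k for k in range(1, 401)]
    print(is_nash_on_grid(bluff, 1.0, grid), allocate(bluff, 1.0))
```

In this instance user 1 receives the whole resource, and the resulting aggregate utility $\log 2$ is below the optimal value $3\log(4/3)$.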


The equilibrium in the preceding example involves a "bluff": user 1 declares such a
high marginal utility at $C$ that all other users concede. One way to preclude such
equilibria is to enforce an assumption that guarantees participation. The next proposition
assumes that all users have infinite marginal utility at zero allocation; this guarantees
that all Nash equilibria are efficient.


Proposition 21.14 Suppose that $U_r'(0) = \infty$ for all $r$. Suppose that $\boldsymbol{\theta}$ is a Nash
equilibrium. Then $\mathbf{d}(\boldsymbol{\theta})$ is an optimal solution to SYSTEM.


PROOF Let $\mathbf{d} = \mathbf{d}(\boldsymbol{\theta})$. The proof follows by noting that all users must have
positive allocations at equilibrium if $U_r'(0) = \infty$, from (21.44). Thus at equilibrium,
for all users $r, s$ we have $U_r'(d_r) = \bar{U}'(d_s; \theta_s)$. But this in turn implies that
$U_r'(d_r) = U_s'(d_s)$ for all $r, s$, a sufficient condition for optimality for the problem
SYSTEM. □


Intuitively, for efficiency to hold, we need to have a number of actively “competing” 
users. In the previous result, this is guaranteed because every user will want strictly 
positive rate at any equilibrium. 

The results of this section demonstrate that by relaxing the assumption that the 
resource allocation mechanism must set a single price, we can in fact significantly 
improve upon the efficiency guarantee of Theorem 21.10. It is critical to note that this 
gain in efficiency occurs only at Nash equilibria. The classical VCG mechanisms are 
unique in that they guarantee efficient outcomes as dominant strategy equilibria; it is 
straightforward to check that the SSVCG mechanisms described in this section will 
not have dominant strategy equilibria in general—e.g., the “bluff” example above is 
one such case.


21.5 Chapter Summary and Further Directions


This chapter considered the allocation of a single resource of fixed supply among 
multiple strategic users. We evaluated a variety of market mechanisms through 
Nash equilibria of the resulting resource allocation game. Our key insights are the 
following: 


(i) A simple proportional allocation mechanism, where each user receives a share of
the resource in proportion to their bid, ensures full efficiency when users are price
takers, and exhibits no worse than a 25% efficiency loss when users are price
anticipators.

(ii) In a natural class of mechanisms where users choose one-dimensional strategies, and
the market sets a single price, the proportional allocation mechanism minimizes the
worst-case efficiency loss when users are price anticipating; i.e., the best possible
guarantee here is 75% of maximal aggregate utility.

(iii) This guarantee can be improved if the mechanism is allowed to set one price per
user. Using an adapted version of the VCG class of mechanisms, we can construct
mechanisms that ensure fully efficient Nash equilibria.


Our investigation also reveals several further directions open for future research, 
including the following: 


(i) For the proportional allocation mechanism, we have proven a bound on the price of
anarchy that shows that the Nash equilibrium aggregate utility is no worse
than 3/4 of the maximum possible aggregate utility. For nonatomic selfish routing (cf.
Chapter 18), a similar price of anarchy result holds: the ratio of Nash cost to the
optimal cost is no worse than 4/3; furthermore, both proofs use the characterization of
Nash equilibria as solutions to an optimization problem, with structure similar to the
respective efficient optimization problems. These results are suggestive of perhaps a
deeper generalization of price of anarchy for games with equilibria characterized as
the solution to optimization problems.

(ii) While Theorem 21.10 proves optimality of the proportional allocation mechanism in
a reasonable class of mechanisms, the result depends critically on the assumption that
all mechanisms in $\mathcal{D}$ yield concave payoffs when agents are price anticipating. Given
that some type of quasiconcavity assumption is typically necessary on payoffs to
even guarantee existence of Nash equilibria, one might informally expect the result of
Theorem 21.10 to hold even if Condition 2 is removed in the definition of $\mathcal{D}$. Whether
this is in fact possible remains an open question.

(iii) Our investigation shows, under reasonable assumptions, that with a single market-clearing
price a 75% efficiency guarantee is possible, while with one price per user
(the scalar strategy VCG approach), full efficiency is possible. This warrants further
investigation: what is the exact trade-off between the number of prices and the efficiency
guarantee possible? Furthermore, how does increasing the dimensionality of
users' strategy affect this efficiency guarantee?


21.6 Notes


21.6.1 Section 21.2


Much of the material in this section is based on Chapter 2 of Johari (2004) and the 
corresponding paper (Johari and Tsitsiklis, 2004). 

The mechanism discussed here was first studied in the context of communication 
networks by Kelly (1997). (See Chapter 22 for a discussion of the proportional al- 
location mechanism in congestion control algorithms for communication networks.) 
Theorem 21.1 is adapted from Kelly (1997), where it is proven in greater generality 
for an extension of the proportional allocation mechanism to a network context. This 
theorem is an extension of the classical first fundamental theorem of welfare economics; 
see Mas-Colell et al. (1995, Chapter 16), for details. 

The first proof of uniqueness of Nash equilibrium for the proportional allocation 
mechanism was provided by La and Anantharam (2000). The most general result of 
existence and uniqueness, and the basis for the result in Theorem 21.2, is due to Hajek 
and Gopalakrishnan (2002); a less general result was proven by Maheswaran and Basar 
(2003). The explicit formulation of the problem GAME is given by Johari and Tsitsiklis 
(2004). 

The price of anarchy result of Theorem 21.4 is due to Johari and Tsitsiklis (2004). 
The original proof of this result uses a two-step approach: it is first shown that the worst 
case is achieved using linear utility functions, and then the efficiency loss calculation 
is solved directly as a mathematical programming problem. The proof based on the 
problem GAME presented here is due to Roughgarden (2006), who also successfully 
applies the same method to efficiency loss calculations in several other games. 


21.6.2 Section 21.3


Much of the material in this section is based on Chapter 5 of Johari (2004) and Section 
4 of Johari and Tsitsiklis (2007). 

The most closely related result to this section is presented by Maheswaran and Basar 
(2004). In their result, they consider mechanisms where each user r chooses a bid w,, 
and the allocation is still made proportional to each player’s bid. However, rather 
than assuming that every player pays w, as in the standard proportional allocation 
mechanism, Maheswaran and Basar consider a class of mechanisms where the user 
pays c(w,), where c is a convex function. They show that in this class of mechanisms, 
the proportional allocation mechanism (i.e., a linear c) achieves the minimal worst-case 
efficiency loss when users are price anticipating. 

Our work is substantially different, because we do not postulate that the mechanism 
must use the proportional rule (21.29) in allocating the resource; rather, this emerges 
as a consequence of rather simple assumptions on our mechanisms. We note that other 
works on inefficiency of resource allocation mechanisms, including Maheswaran and 
Basar (2004) and Yang and Hajek (2004), also assume a priori that allocations are made 
in proportion to users' bids.⁷ In this sense, our result lends a rigorous foundation to the
intuition that the proportional allocation rule (21.29) is a natural choice to determine
the allocation among users.

⁷ A notable exception is Sanghavi and Hajek (2004), which assumes that users pay their bid, and then designs an
allocation rule to minimize worst case efficiency loss.


21.6.3 Section 21.4


This section is based on Section 5.2 of the paper by Johari and Tsitsiklis (2007). 
Simultaneously and independently, a nearly identical formulation was developed by 
Yang and Hajek (2007). It is worth noting that Yang and Hajek and Maheswaran and 
Basar had earlier presented a resource allocation mechanism where users receive an 
allocation in proportion to their bids, but prices are chosen on an individualized basis 
(Maheswaran and Basar, 2004; Yang and Hajek, 2004); this mechanism can be seen to 
be a special case of the SSVCG mechanisms (Johari and Tsitsiklis, 2007). 

Subsequent to the above work, several papers have presented related constructions of 
mechanisms that use limited communication yet achieve fully efficient Nash equilibria. 
Building on earlier work by Semret (1999), Dimakis et al. establish that a VCG-like 
mechanism where agents submit a pair (price and quantity requested) can achieve fully 
efficient equilibrium for a related resource allocation game (Dimakis et al., 2006). 
Stoenescu and Ledyard consider the problem of resource allocation by building on the 
notion of minimal message spaces addressed in earlier literature on mechanism design, 
and build a class of efficient mechanisms with scalar strategy spaces (Stoenescu and 
Ledyard, 2006). 

The latter work of Stoenescu and Ledyard recalls perhaps the most related reference 
(and most seminal) in this area by Reiter and Reichelstein (1988). Their paper calcu- 
lates the minimal dimension of strategy space that would be necessary to achieve fully 
efficient Nash equilibria for a general class of economic models known as exchange 
economies. For our model, their bound evaluates to a strategy space per user of dimen- 
sion 1 + 2/(R(R − 1)), where R denotes the number of users. This is slightly higher 
than our result because Reiter and Reichelstein consider a much more general resource 
allocation problem. 
Exercises 


21.1 This exercise, together with the next one, studies the efficiency loss properties of 
the mechanisms defined in Example 21.9, by following the proof of Theorem 21.4. 
Suppose that $D(p, \theta) = \theta p^{-1/c}$, where c > 1. Suppose that given a utility system 
(C, R, U), a bid vector $\theta$ is a Nash equilibrium, and let the resulting allocation 
vector be d; i.e., $d_r = D(p_D(\theta), \theta_r)$. 


(a) Verify the Nash equilibrium conditions (21.31)-(21.32). 
(b) Show that d is the unique solution to GAME, but where $\hat{U}_r$ is defined as follows 
for each r: 


\[ \hat{U}_r(d_r) = \int_0^{d_r} \left( \frac{1 - z/C}{1 + (c - 1) z/C} \right) U_r'(z) \, dz. \tag{1.1} \]


(Hint: rearrange the Nash equilibrium conditions (21.31)-(21.32).) 
(c) Show that $\hat{U}_r$ satisfies Assumption 1. 


21.2 Fix $D(p, \theta) = \theta p^{-1/c}$ and define $\hat{U}$ as in the previous exercise. Define $\beta(D)$ accord- 
ing to (21.24), i.e., 


pint ne See 
Ue C>0 0<d,d<C U(d 


= 


WY 


(a) Show that $\rho(D) \ge \beta(D)$. (Hint: first construct the variational inequality that 
identifies the optimality conditions for GAME, then argue as in the proof of 
Theorem 21.4.) 

(b) Show that $\beta(D) \ge G(c)$. 
(c) Using a construction analogous to the proof of Theorem 21.4, show that for 
any $\delta > 0$ there exists a utility system for which the ratio of Nash aggregate utility 
to the maximum aggregate utility is no more than $G(c) + \delta$. Conclude that 
$\rho(D) = G(c)$. 


21.3 Show by example that a VCG mechanism does not necessarily charge each user 
the same per-unit price for the resource. 


PART FOUR 


Additional Topics 


CHAPTER 22 


Incentives and Pricing 
in Communications Networks 


Asuman Ozdaglar and R. Srikant 


Abstract 


In this chapter, we study two types of pricing mechanisms: one where the goal of the pricing scheme 
is to achieve some socially beneficial objective for the network and the other where prices are set 
by multiple competing service providers to maximize their revenues. For both cases, we present 
an overview of the mathematical models involved and the relevant optimization and game-theoretic 
techniques needed to study these models. We study the impact of different degrees of strategic inter- 
actions among users and between users and service providers on the network performance. We also 
relate our models and solutions to practical resource allocation mechanisms used in communication 
networks such as congestion control, routing, and scheduling. We conclude the chapter with a brief 
introduction to other game-theoretic topics in emerging networks. 


This chapter studies the problem of decentralized resource allocation among competing 
users in communication networks. The growth in the scale of communication networks 
and the newly emerging interactions between administrative domains and end users 
with different needs and quality of service requirements necessitate new approaches 
to the modeling and control of communication networks that recognize the difficulty 
of formulating and implementing centralized control protocols for resource allocation. 
The current research in this area has developed a range of such approaches. Central to 
most of these approaches is the modeling of end users and sometimes also of service 
providers as self-interested agents that make decentralized and selfish decisions. This 
research has two important implications: 


(i) The modeling of communication networks consisting of multiple selfish agents requires 
tools from game theory. 

(ii) In the absence of centralized control, the interaction of multiple selfish agents may lead 
to suboptimal resource allocation. 


This chapter will survey and develop existing work focusing on the role of prices, 
both used as control parameters in the network and set by service providers to in- 
crease their revenues. We will identify the different roles that prices may play in 
communication networks depending on the degree of strategic interactions among users 
and between users and service providers, and explore their impact on network perfor- 
mance under different scenarios. We will also highlight how the study of large-scale 
communication networks raises new modeling challenges and develop the mathemati- 
cal tools that are commonly used in this analysis. 

The chapter is organized into three sections: the first two sections correspond to two 
conceptually different strategic settings, one where pricing is used to achieve some 
socially beneficial objective, and the other where prices are set by multiple service 
providers to maximize their revenues. The last section places the material in this 
chapter in the context of the broader literature, discusses some emerging applications 
of game theory to communication networks, and suggests a number of areas for future 
research. 


22.1 Large Networks — Competitive Models 


In this section, we present a brief overview of the literature on pricing to maximize 
system utility in a network with a large number of users. This line of research has had 
a tremendous impact on communication networks, having contributed both to a deeper 
understanding of network architectures and to the development of new protocols for 
more efficient use of resources in the Internet. We will end the section with some 
extensions to wireless networks. 

Consider a large network shared by many users, where the goal is to share the 
network resources in an optimal manner. It may be useful to think of the network as a 
graph with nodes and links. Each end user in the network is interested in transferring 
data between a source node and a destination node along a fixed route (or connection). 
We will use the terms “user,” “source,” and “connection” interchangeably. The nodes 
are interconnected by links. The network resources that we consider here are the link 
bandwidths. The bandwidth of a link is the maximum rate at which it can transmit data 
between the two nodes at either end of the link. We associate a utility function with 
each user in the network, and we will refer to a resource allocation scheme as being 
socially optimal if it maximizes the sum of utilities of all users in the network.1 

A network is modeled as a set of resources indexed by l, called links, with finite 
capacities $c_l$. It is shared by a set of sources, indexed by r. Let $U_r(x_r)$ be the utility 
of source r as a function of its rate $x_r$ (measured in packets per unit time). The utility 
function $U_r$ is assumed to be a strictly increasing, strictly concave function. Associated 
with each source is a route that is a collection of links in the network. Let R be a routing 
matrix whose (l, r) entry is 1 if source r’s route includes link l and is 0 otherwise. 
Since there is a one-to-one mapping between users and routes, we will use the same 
index to denote both a user and its route. For example, an index r can represent both 
user r and its route. Thus, the notation $l \in r$ indicates that link l is in the route of 
user r. 


1 In the networking literature, social optimality and fairness are often used interchangeably. For other notions of 
fairness, see Cho and Goel (2006). 
The resource allocation problem can be formulated as the following nonlinear opti- 
mization problem (Kelly, 1997): 


\[ \max_{x \ge 0} \sum_r U_r(x_r) \quad \text{subject to} \quad Rx \le c, \tag{22.1} \]


where x is the vector of source rates and c is the vector of link capacities. The constraint 
says that, at each link l, the aggregate source rate $\sum_r R_{lr} x_r$ does not exceed the capacity 
$c_l$. If the utility functions are strictly concave, then the above optimization problem has 
a unique optimal solution, which we refer to as the socially optimal allocation. 

To solve this problem directly, we have to know the utility functions and routes of 
all the sources in the network. In a large network such as the Internet, this information 
is not available centrally. One solution to this problem is to devise a mechanism such 
as the celebrated Vickrey-Clarke-Groves (VCG) mechanism to encourage users to 
reveal their utilities truthfully (see Chapters 5 and 9). However, such a mechanism is 
computationally complex to implement and would also require a central authority to 
solve an optimization problem to compute the prices. Instead, Kelly devised a simple 
mechanism capable of achieving the optimal allocation of resources in the presence 
of selfish users (see also Chapter 21). We will describe this scheme in the rest of this 
section and show how the pricing motivation also leads to protocols for managing 
the Internet. Such a scheme was originally proposed in Kelly (1997) and Kelly et al. (1998), 
and variations have been considered in Low and Lapsley (1999), Yaiche et al. (2000), 
and Kunniyur and Srikant (2002); for a more exhaustive survey of the work in this 
area, see Srikant (2004). 

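Given the convexity of (22.1), a vector of rates $\hat{x}$ is optimal if there exists a vec- 
tor of Lagrange multipliers $\hat{p}$ satisfying the following Karush-Kuhn-Tucker (KKT) 
conditions: 


\begin{align}
U_r'(\hat{x}_r) &= \sum_{l:\, l \in r} \hat{p}_l, \quad \forall r, \tag{22.2} \\
\hat{p}_l \Big( \sum_{r:\, l \in r} \hat{x}_r - c_l \Big) &= 0, \quad \forall l, \tag{22.3} \\
\sum_{r:\, l \in r} \hat{x}_r &\le c_l, \quad \forall l, \tag{22.4} \\
\hat{p}, \hat{x} &\ge 0. \tag{22.5}
\end{align}


Now, suppose that the network can compute $\hat{p}$ and charges each user r a price per bit 
of $\hat{q}_r$, where $\hat{q}_r$ is given by 


\[ \hat{q}_r = \sum_{l:\, l \in r} \hat{p}_l. \tag{22.6} \]


In vector form, the above relationship can be written as $\hat{q} = R^T \hat{p}$. 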
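
As a small illustration of the vector relationship between route prices and link prices, the following Python sketch computes q = R^T p for a toy network. The three-route, two-link topology, the numerical link prices, and the helper name `route_prices` are hypothetical and chosen only for illustration; they do not come from the text.

```python
def route_prices(R, p):
    """Return q = R^T p, where R[l][r] is 1 if link l lies on route r, else 0."""
    num_links, num_routes = len(R), len(R[0])
    return [sum(R[l][r] * p[l] for l in range(num_links))
            for r in range(num_routes)]


# Hypothetical network: route 0 uses both links, route 1 uses only link 0,
# and route 2 uses only link 1.
R = [[1, 1, 0],
     [1, 0, 1]]
p = [0.5, 0.25]          # assumed link prices (Lagrange multipliers)
q = route_prices(R, p)   # price per bit charged on each route
print(q)
```

Each entry of `q` is simply the sum of the prices of the links on that route, so a user whose route crosses more congested links pays more per bit.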

If the contribution of each user’s flow to the aggregate is negligible, we expect them 
to take aggregate quantities, in particular prices, as given in their decisions. In this 
case, we refer to the users as price takers. Under this assumption, user r’s optimization 
problem can be expressed as 


\[ \max_{x_r \ge 0} \; U_r(x_r) - \hat{q}_r x_r. \tag{22.7} \]


This expression is intuitive since it implies that each user is maximizing his utility minus 
the marginal cost of his flow, which consists of the sum of the Lagrange multipliers of 
the links traversed on its route. Clearly the solution to this problem is given by $\hat{x}_r$ in 
(22.2). The equilibrium under this pricing scheme, where each user is charged the sum 
of the Lagrange multipliers on its path, coincides with the socially optimal outcome. 
There are two key assumptions behind this conclusion: (1) users are price takers, which 
is reasonable in the case of a large network such as the Internet, and (2) prices are set 
equal to the Lagrange multipliers to implement the socially optimal allocation. The 
second assumption is reasonable when prices are set by a network controller interested in the 
overall performance. We will discuss in the next section how the situation differs when 
prices are set by profit-maximizing service providers. 

For the above pricing scheme to work, the network has to be able to compute the 
Lagrange multipliers. There are two problems associated with this computation: 


P1 The network does not know the utility functions of the users. 
P2 Even if all the utility functions are known, there is no central authority that knows all 
the link capacities and the network topology to be able to solve (22.2)—(22.5). 


To address (P1)-(P2), we consider the following two-step mechanism. First, each user 
r announces a bid $w_r$, which is the price per unit time that it is willing to pay. Then, 
the network decides to allocate rates to users according to the solution of the following 
optimization problem: 


\[ \max_{x \ge 0} \sum_r w_r \log(x_r) \quad \text{subject to} \quad Rx \le c. \tag{22.8} \]


The solution to the above optimization problem is called a weighted proportionally fair 
rate allocation. The KKT conditions for the optimization problem (22.8) are given by 


\begin{align}
\frac{w_r}{x_r^*} &= \sum_{l:\, l \in r} p_l^*, \quad \forall r, \tag{22.9} \\
p_l^* \Big( \sum_{r:\, l \in r} x_r^* - c_l \Big) &= 0, \quad \forall l, \tag{22.10} \\
\sum_{r:\, l \in r} x_r^* &\le c_l, \quad \forall l, \tag{22.11} \\
p^*, x^* &\ge 0, \tag{22.12}
\end{align}


where $x^*$ is the solution to (22.8) and $p^*$ is the associated vector of Lagrange multipliers. 
Furthermore, if the user can be induced to select $w_r = x_r^* U_r'(x_r^*)$, then $x^* = \hat{x}$ and the 
network problem coincides with the social welfare maximization problem. 
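
A worked special case may help fix ideas. Assume, purely for illustration, a single link of capacity c shared by every user, with all bids strictly positive. Because each $w_r > 0$, the price must be positive, so the capacity constraint holds with equality and the KKT conditions (22.9)-(22.12) can be solved in closed form:

```latex
% Illustrative special case (an assumption, not part of the text):
% one link of capacity c, every route uses it, and all bids w_r > 0.
\begin{align*}
\frac{w_r}{x_r^*} = p^* \;\;&\Longrightarrow\;\; x_r^* = \frac{w_r}{p^*}, \\
\sum_r x_r^* = c \;\;&\Longrightarrow\;\; p^* = \frac{1}{c} \sum_s w_s, \\
&\Longrightarrow\;\; x_r^* = c \, \frac{w_r}{\sum_s w_s}.
\end{align*}
```

In this special case each user receives a share of the capacity proportional to its bid, and the link price is the total bid per unit of capacity.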

To implement the mechanism described above, we have to first design a distributed 
algorithm to solve (22.8). The algorithm that we design is a dynamic algorithm where 
each link computes a price as a function of time according to a differential equation. The 
differential equation is designed so that, in steady state, the price of each link converges 
to the Lagrange multiplier corresponding to the link’s resource constraint. To this end, 
suppose that each link computes a price according to the differential equation 


\[ \dot{p}_l = (y_l - c_l)^+_{p_l}, \tag{22.13} \]


where $p_l(t)$ is the instantaneous link price at time t, $y_l = \sum_{r:\, l \in r} x_r$ is the total arrival 
rate at link l, and $(a)^+_b$ is equal to max(a, 0) when b = 0 and is equal to a if b > 0. 
Note that the equilibrium of this differential equation is either $y_l = c_l$ or $p_l = 0$, which 
satisfies one of the KKT conditions, namely (22.10). Each user’s computer is hardwired with a 
program that computes rates according to the equation 


\[ x_r = \frac{w_r}{q_r}, \tag{22.14} \]

where $q_r$ is the price of route r and is given by $q_r = \sum_{l:\, l \in r} p_l$. 

To implement the above set of equations, it is assumed that user r’s computer 
is equipped with a protocol to collect $q_r$, the price of its path, from the network. In 
networking parlance, equation (22.14) is called a congestion control algorithm since the 
user reacts to congestion indication in the form of $q_r$. It is easy to see that if equations 
(22.13)-(22.14) converge, then their steady-state values satisfy (22.9)-(22.12) and thus 
solve the optimization problem (22.8). Indeed, the above set of equations converges under 
some mild assumptions. Let us suppose that the routing matrix R has full row rank, i.e., 
given a vector q of route prices, the vector of link prices p is uniquely determined by 
the equation $q = R^T p$. Since $x^*$ is unique, this assumption ensures that $p^*$ is unique. 
The following identity is useful: 


\[ q^T x = p^T R x = p^T y. \]

Now, consider the Lyapunov function 

\[ V(p) = \frac{1}{2} (p - p^*)^T (p - p^*). \]


Differentiating the Lyapunov function, we get 


\begin{align*}
\frac{dV}{dt} &= \sum_l (p_l - p_l^*) \, \dot{p}_l \\
&= \sum_l (p_l - p_l^*) (y_l - c_l)^+_{p_l} \\
&\overset{(a)}{\le} (p - p^*)^T (y - c) \\
&= (p - p^*)^T (y - y^*) + (p - p^*)^T (y^* - c) \\
&\overset{(b)}{\le} (p - p^*)^T (y - y^*) \\
&= (p - p^*)^T R (x - x^*) = (q - q^*)^T (x - x^*) \\
&\overset{(c)}{\le} 0,
\end{align*}
where (a) follows from the fact that if the projection $(\cdot)^+_{p_l}$ is not active, then the inequality 
holds as an equality, and if the projection is active, the right-hand side of (a) is nonnegative 
while the right-hand side of the equation above (a) is zero. Inequality (b) follows from 
the fact that either $y_l^* = c_l$ or $y_l^* < c_l$ and $p_l^* = 0$. Finally, inequality (c) follows from 
the fact that $1/x_r$ is a decreasing function. Thus, for a fixed set of bids $\{w_r\}$, the system 
of equations (22.13)-(22.14) converges to the point $(x^*, p^*)$. 
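
The convergence claim can be explored numerically. The following Python sketch integrates the link price dynamics (22.13) together with the rate rule (22.14) using a forward Euler step. The discretization, the step size, the iteration count, the toy topology, and the function name `simulate_dual_algorithm` are all assumptions made for illustration; the text itself only specifies the continuous-time equations.

```python
def simulate_dual_algorithm(R, c, w, p0, step=0.01, iters=5000):
    """Euler discretization (an assumption) of the price dynamics (22.13)
    combined with the rate rule x_r = w_r / q_r from (22.14)."""
    num_links, num_routes = len(R), len(R[0])
    p = list(p0)
    x = [0.0] * num_routes
    for _ in range(iters):
        # Route prices: q_r is the sum of link prices along route r.
        q = [sum(R[l][r] * p[l] for l in range(num_links))
             for r in range(num_routes)]
        # Rate rule (22.14); the small floor only guards against division by zero.
        x = [w[r] / max(q[r], 1e-9) for r in range(num_routes)]
        # Aggregate arrival rate at each link.
        y = [sum(R[l][r] * x[r] for r in range(num_routes))
             for l in range(num_links)]
        # Price update with projection onto the nonnegative half-line.
        p = [max(p[l] + step * (y[l] - c[l]), 0.0) for l in range(num_links)]
    return x, p


# Hypothetical network: route 0 uses both links, routes 1 and 2 use one link each.
R = [[1, 1, 0],
     [1, 0, 1]]
c = [1.0, 1.0]           # link capacities
w = [1.0, 1.0, 1.0]      # fixed bids
x, p = simulate_dual_algorithm(R, c, w, p0=[1.0, 1.0])
print(x, p)
```

For this symmetric example the KKT conditions (22.9)-(22.12) can be solved by hand: both link prices equal 3/2, the two-link route gets rate 1/3, and each one-link route gets rate 2/3. The simulated values approach these numbers.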

The above Lyapunov argument indicates that the congestion control algorithm is 
stable if $w_r$ is fixed. However, since the price that a user pays is a function of its bid $w_r$, 
it is in the interest of the user to vary $w_r$. How might the user vary $w_r$? In general, we 
may expect users to act strategically and take into account the impact of their current 
bid on the future prices they will face. However, for our purposes here, let us suppose 
that they ignore these strategic aspects and behave myopically. In this case, they will 
simply maximize instantaneous net utility, so the user’s optimization problem to choose 
$w_r$ is given by 

\[ \max_{w_r \ge 0} \; U_r\!\left( \frac{w_r}{q_r} \right) - w_r, \]

or equivalently as 

\[ \max_{x_r \ge 0} \; U_r(x_r) - q_r x_r. \]

The congestion control algorithm then becomes 

\[ x_r = (U_r')^{-1}(q_r). \tag{22.15} \]


The equilibrium point of the differential equation (22.13) is then given by (22.9)- 
(22.12) with $w_r$ replaced by $x_r^* U_r'(x_r^*)$. In this case, $x^* = \hat{x}$, where we recall that 
$\hat{x}$ is the optimal solution of (22.1) and satisfies (22.2)-(22.5). Thus, if the users are 
price-taking and myopic, then the users’ selfish objectives coincide with the social 
welfare objective of the system. To prove the convergence of (22.13)-(22.15), one 
can use the same Lyapunov function V(p) as before and proceed along the same 
lines. 
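
To make the myopic, price-taking case concrete, the sketch below simulates the link price dynamics (22.13) with users who set their rates by inverting their marginal utility at the current price. Everything specific here is an assumption for illustration: a single shared link, the utility family U_r(x) = -a_r / x (strictly increasing and strictly concave for x > 0, with marginal utility a_r / x^2), the Euler step, and the name `simulate_myopic_users`.

```python
import math


def simulate_myopic_users(a, c, p0=0.5, step=0.01, iters=5000):
    """Single link of capacity c; user r has the assumed utility U_r(x) = -a_r / x.
    Each user picks the rate at which marginal utility a_r / x^2 equals the price."""
    p = p0
    x = [0.0] * len(a)
    for _ in range(iters):
        q = max(p, 1e-9)                       # route price equals the link price here
        x = [math.sqrt(a_r / q) for a_r in a]  # inverse of the marginal utility at q
        p = max(p + step * (sum(x) - c), 0.0)  # projected Euler step of (22.13)
    return x, p


a = [1.0, 4.0]   # hypothetical utility parameters
c = 3.0          # hypothetical link capacity
x, p = simulate_myopic_users(a, c)
# Bids implied by the myopic rule w_r = x_r * U_r'(x_r) = a_r / x_r.
bids = [a_r / x_r for a_r, x_r in zip(a, x)]
print(x, p, bids)
```

For these parameters the social optimum of (22.1) can be checked by hand: equal marginal utilities with x_1 + x_2 = 3 give x = (1, 2) at price 1, which is the point the simulation approaches.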

An interesting side benefit of the pricing scheme above is that it provides a natural 
decomposition of the network functionalities that is useful in designing the architecture 
of a communication network. The pricing model suggests that the resource allocation 
functionality should be decomposed into pieces implemented in different parts of the 
network: 


(i) Congestion control at the end users: The end users should be equipped with a 
protocol to adapt their rates in response to congestion feedback (route price) from the 
network. 

(ii) Congestion indication at the routers: The routers (the nodes in the graph) in the 
network should be equipped with a protocol to compute the price of each link that 
originates from the router. The price is an indicator of congestion on the link. 
(iii) Congestion feedback from the network to the users: There must be a protocol that 
allows an end user to collect congestion information from the network. For example, 
each data packet could contain a field to collect the congestion information. This 
congestion field could be set to zero at the source and each router on the path can add 
its price to this field. When the data packet reaches the destination, the congestion 
field will contain the price of the route. The destination can then send a packet to the 
source to convey the route price information. 
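
The feedback mechanism in item (iii) can be sketched in a few lines of Python. The `Packet` class, its field names, the link identifiers, and the prices below are hypothetical; the sketch only mirrors the idea that each router adds its link price to a congestion field that starts at zero.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    route: list               # identifiers of the links traversed, in order
    congestion: float = 0.0   # congestion field, set to zero at the source


def forward(packet, link_prices):
    """Each router on the path adds the price of its outgoing link to the field."""
    for link in packet.route:
        packet.congestion += link_prices[link]
    return packet.congestion


# Hypothetical link prices maintained by the routers.
link_prices = {"a": 0.5, "b": 0.25, "c": 1.0}
packet = Packet(route=["a", "b"])
route_price = forward(packet, link_prices)  # value the destination echoes to the source
print(route_price)
```

The value carried back to the source is the route price that the end user's congestion control algorithm reacts to.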


The pricing framework introduced in this section can also be extended to incorporate 
other functionalities such as scheduling in a wireless network. We will briefly illustrate 
the extension to wireless networks, using a simple model; for a more general treatment, 
please see the survey (Lin et al., 2006) and the references within. 

In a wireline network, packets can be transferred on all links simultaneously. How- 
ever, in a wireless network, due to interference and collision, if a packet is scheduled 
on a link, other links in a neighborhood should be silent to avoid collisions and the 
resulting packet loss. We refer to a set of links that can be scheduled simultaneously 
as a schedule. Let $M_1, M_2, \ldots, M_K$ be the set of possible schedules in a network. Let 
$f_i$ be the fraction of time that the network uses schedule $M_i$. The resource constraints 
in the network can now be expressed as 


\begin{align}
\sum_{r:\, l \in r} x_r &\le \sum_{i:\, l \in M_i} f_i c_l, \quad \forall l, \tag{22.16} \\
\sum_{i=1}^{K} f_i &\le 1, \tag{22.17} \\
x, f &\ge 0, \tag{22.18}
\end{align}


where $c_l$ is the number of packets that can be served by link l if it is scheduled. The goal 
is to find $\{x_r\}$ and $\{f_i\}$ to maximize $\sum_r U_r(x_r)$. The dual of the problem of maximizing 
$\sum_r U_r(x_r)$ subject to the constraints (22.16)-(22.18) is 


\[ \min_{p \ge 0,\, \lambda \ge 0} D(p, \lambda), \]


where 


\begin{align}
D(p, \lambda) &= \max_{x \ge 0,\, f \ge 0} \; \sum_r U_r(x_r) - \sum_l p_l \Big( \sum_{r:\, l \in r} x_r - \sum_{i:\, l \in M_i} f_i c_l \Big) - \lambda \Big( \sum_{i=1}^{K} f_i - 1 \Big) \notag \\
&= \max_{x \ge 0} \sum_r \Big( U_r(x_r) - x_r \sum_{l:\, l \in r} p_l \Big) \tag{22.19} \\
&\quad + \max_{f \ge 0} \bigg[ \sum_l p_l \sum_{i:\, l \in M_i} f_i c_l - \lambda \Big( \sum_{i=1}^{K} f_i - 1 \Big) \bigg]. \tag{22.20}
\end{align}

It is not difficult to see that the dual objective for the wireline problem would also 
contain the term (22.19), while (22.20) is unique to the wireless problem. This suggests 
that the algorithm to compute x and p would be quite similar to the wireline case, but 
additional computation is necessary to find the optimal value of f. Without using the 
Lagrange multiplier $\lambda$, note that (22.20) can be equivalently written as 


\[ \max_{\sum_i f_i \le 1,\, f \ge 0} \; \sum_l p_l \sum_{i:\, l \in M_i} f_i c_l \;=\; \max_{\sum_i f_i \le 1,\, f \ge 0} \; \sum_i f_i \sum_{l \in M_i} p_l c_l \;=\; \max_i \sum_{l \in M_i} p_l c_l, \]


where the first equality is a simple interchange of the sums and the second equality 
follows from the fact that the optimization is a linear program and hence the solution 
will occur at a corner point. The last maximization problem can be interpreted as 
follows: pick the schedule that has the largest weighted price where the weights are the 
link capacities. The update equation at the source remains the same as before and is 
given by (22.15). It should be noted that while the network picks one of the schedules 
$M_1, M_2, \ldots, M_K$ to solve (22.15) at each time instant, it turns out that the long-run 
fractions of time for which the schedules are used form an optimal solution to the utility 
maximization problem; the interested reader is referred to Lin et al. (2006) and references within. 
The price updates at the links are given by 


\[ \dot{p}_l = \Big( y_l - \sum_{i:\, l \in M_i} f_i c_l \Big)^+_{p_l}. \tag{22.21} \]


Note that the above equation does not have to be explicitly implemented; it is simply the 
queue length at link /, which will be automatically maintained by each link. Thus, the 
only additional implementation required in a wireless network is the computation of 
the maximum weighted price schedule. This is a computationally hard problem and, 
in practice, also requires a distributed implementation to be feasible. The problem of 
low complexity, distributed algorithms to approximate the maximum weighted price 
schedule is currently open. Assuming that such an algorithm exists, the stability of 
equations (22.15)—(22.21) can be established using a Lyapunov function approach 
similar to the wireline case. 
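
For a network small enough to enumerate its schedules, the maximum weighted price schedule can be found by brute force, as in the Python sketch below. The three-link example, the prices, the capacities, and the name `max_weight_schedule` are hypothetical. Exhaustive enumeration is used only for illustration; as noted above, the general problem is computationally hard and this sketch does not address that.

```python
def max_weight_schedule(schedules, p, c):
    """Return the schedule maximizing the sum of p_l * c_l over its links.
    Exhaustive search over an explicit list of schedules (illustrative only)."""
    def weight(schedule):
        return sum(p[l] * c[l] for l in schedule)
    return max(schedules, key=weight)


# Hypothetical three-link line network: links 0 and 2 can be active together,
# while link 1 interferes with both of them.
schedules = [(0, 2), (1,)]
p = [1.0, 3.0, 1.5]   # current link prices (queue lengths)
c = [1.0, 1.0, 1.0]   # packets served per link when scheduled
best = max_weight_schedule(schedules, p, c)
print(best)
```

Here the schedule containing only link 1 has weight 3.0, which exceeds the weight 2.5 of the schedule with links 0 and 2, so the heavily backlogged link is served first.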


22.2 Pricing and Resource Allocation — Game 
Theoretic Models 


The previous section explored how prices can be used as control parameters for al- 
locating resources in communication networks. The analysis was non-game theoretic 
since users were assumed to be price takers and prices were set as control parameters 
to achieve the socially optimal allocation. While the framework with prices as control 
parameters is a useful starting point, it ignores a number of issues that are important 
for the analysis of resource allocation in large-scale communication networks. First, 
in a number of settings, where centralized control signals may be impractical or im- 
possible, end users may not face explicit prices. It is therefore important to understand 
the implications of selfish end-user behavior when the congestion they create and 
their use of scarce resources are not priced. Second, prices are often set by multiple 
service providers in control of their administrative domains with the objective of max- 
imizing their (long-run) revenues. In this section, we investigate the implications of 
profit-maximizing pricing by multiple decentralized service providers. We turn to a 
discussion of other possible generalizations in the next section. 


22.2.1 Pricing and Efficiency with Congestion Externalities 


We now construct a model of resource allocation in a network with competing self- 
ish users and profit-maximizing service providers. The central question is whether 
the equilibrium prices that emerge in such a framework will approximate the prices 
implementing the socially optimal allocation discussed in the previous section. The 
class of models incorporating strategic behavior by service providers introduces new 
modeling and mathematical challenges. These models translate into game-theoretic 
competition models with negative congestion externalities,2 whereby the pricing deci- 
sion of a service provider affects the level of traffic and thus the extent of congestion 
in other parts of the network. Nevertheless, tractable analysis of pricing decisions and 
routing patterns is possible under many network topologies. 

Models incorporating for-profit service providers have been previously investigated 
in Basar and Srikant (2002a, 2002b) and Acemoglu and Ozdaglar (2004). Here, we 
develop a general framework for the analysis of price competition among providers in 
a congested (and potentially capacitated) network building on Acemoglu and Ozdaglar 
(2006a, 2006b). We will see that despite its conceptual simplicity, this framework has 
rich implications. We illustrate some of these, for example, by showing the counterin- 
tuitive result that increasing competition among providers can reduce efficiency, which 
is different from the results of the most common models of competition in economics. 
Most importantly, we also show that it is possible to quantify the extent to which prices 
set by competing service providers approximate the control role of prices discussed in 
the previous section. While generally service provider competition does not lead to an 
equilibrium replicating the system optimum, the extent of inefficiency resulting from 
price competition among service providers can often be bounded. 

We start with a simple example that shows the efficiency implications of competition 
between two for-profit service providers. 


Example 22.1 One unit of traffic will travel from an origin to a destination 
using either route 1 or route 2 (cf. Figure 22.1). The latency functions of the links, 
which represent the delay costs as a function of the total link flow, are given by 


\[ l_1(x) = \frac{x^2}{3}, \qquad l_2(x) = \frac{2}{3} x. \]


It is straightforward to see that the efficient allocation [i.e., one that minimizes 
the total delay cost $\sum_i l_i(x_i) x_i$] is $x_1^S = 2/3$ and $x_2^S = 1/3$, while the (Wardrop) 
equilibrium allocation that equates delay on the two paths is $x_1^{WE} \approx 0.73 > x_1^S$ and 
$x_2^{WE} \approx 0.27 < x_2^S$. The source of the inefficiency is that each unit of traffic does 
not internalize the greater increase in delay from travel on route 1, so there is too 
much use of this route relative to the efficient allocation. 


2 An externality arises when the actions of a player in a game affect the payoffs of other players. 
[Figure 22.1: one unit of traffic is routed over two parallel links with latency functions $l_1(x) = x^2/3$ and $l_2(x) = (2/3)x$.] 

Figure 22.1. A two link network with congestion-dependent latency functions. 


Now consider a monopolist controlling both routes and setting prices for travel 
to maximize its profits. We show below that in this case, the monopolist will set 
a price including a markup, which exactly internalizes the congestion externality. 
In other words, this markup is equivalent to the Pigovian tax that a social planner 
would set in order to induce decentralized traffic to choose the efficient allocation. 
Consequently, in this simple example, monopoly prices will be $p_1^{ME} = (2/3)^3 + k$ 
and $p_2^{ME} = (2/3^2) + k$, for some constant k. The resulting traffic in the Wardrop 
equilibrium will be identical to the efficient allocation, i.e., $x_1^{ME} = 2/3$ and 
$x_2^{ME} = 1/3$. 

Finally, consider a duopoly situation, where each route is controlled by a 
different profit-maximizing provider. In this case, it can be shown that equilibrium 
prices will take the form $p_i^{OE} = x_i^{OE} (l_1' + l_2')$ [see Eq. (22.27) in Section 22.2.4], or 
more specifically, $p_1^{OE} \approx 0.61$ and $p_2^{OE} \approx 0.44$. The resulting equilibrium traffic 
is $x_1^{OE} \approx 0.58 < x_1^S$ and $x_2^{OE} \approx 0.42 > x_2^S$, which also differs from the efficient 
allocation. It is noteworthy that although the duopoly equilibrium is inefficient 
relative to the monopoly equilibrium, in the monopoly equilibrium k is chosen 
such that all of the consumer surplus is captured by the monopolist, while in the 
oligopoly equilibrium users may have positive consumer surplus.3 
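
The numbers in Example 22.1 can be reproduced with a short Python sketch. It assumes the latency functions l_1(x) = x^2/3 and l_2(x) = (2/3)x shown in Figure 22.1, and, for the duopoly, it assumes that each provider's price equals its flow times the sum of the two latency derivatives, which is consistent with the prices quoted in the example. The helper `solve_split` and the use of bisection are illustrative choices, not the authors' method.

```python
def l1(x): return x * x / 3.0       # latency on route 1
def l2(x): return 2.0 * x / 3.0     # latency on route 2
def dl1(x): return 2.0 * x / 3.0    # derivative of l1
def dl2(x): return 2.0 / 3.0        # derivative of l2 (constant)


def solve_split(gap, iters=60):
    """Bisection for the route-1 flow x1 in [0, 1] with gap(x1) = 0.
    Assumes gap is increasing, negative at 0, and positive at 1."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Efficient allocation: equalize marginal total delay l_i(x_i) + x_i * l_i'(x_i).
x1_opt = solve_split(lambda x: (l1(x) + x * dl1(x))
                     - (l2(1 - x) + (1 - x) * dl2(1 - x)))

# Wardrop equilibrium without prices: equalize latencies.
x1_we = solve_split(lambda x: l1(x) - l2(1 - x))

# Monopoly markups over the constant k: x_i * l_i'(x_i) at the efficient flows.
m1 = x1_opt * dl1(x1_opt)
m2 = (1 - x1_opt) * dl2(1 - x1_opt)


def slope_sum(x):
    """Sum of the two latency derivatives when route 1 carries flow x."""
    return dl1(x) + dl2(1 - x)


# Duopoly (assumed price form): equalize latency plus price on both routes.
x1_oe = solve_split(lambda x: (l1(x) + x * slope_sum(x))
                    - (l2(1 - x) + (1 - x) * slope_sum(x)))
p1_oe = x1_oe * slope_sum(x1_oe)
p2_oe = (1 - x1_oe) * slope_sum(x1_oe)

print(round(x1_opt, 4), round(x1_we, 4), round(x1_oe, 4))
print(round(m1, 4), round(m2, 4), round(p1_oe, 4), round(p2_oe, 4))
```

Under these assumptions the efficient, Wardrop, and duopoly flows on route 1 come out near 2/3, 0.73, and 0.58 respectively, matching the ordering discussed in the example.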


The intuition for the inefficiency of the duopoly relative to the monopoly is related to 
a new source of (differential) monopoly power for each duopolist, which they exploit 
by distorting the pattern of traffic: when provider 1, controlling route 1, charges a 
higher price, it realizes that this will push some traffic from route 1 to route 2, raising 
congestion on route 2. But this makes the traffic using route 1 become more “locked- 
in,” because their outside option, travel on route 2, has become worse. As a result, the 
optimal price that each duopolist charges will include an additional markup over the 
Pigovian markup. Since the two markups are generally different, they will distort the 
pattern of traffic away from the efficient allocation. 


22.2.2 Model 


We consider a network with I parallel links. Let $\mathcal{I} = \{1, \ldots, I\}$ denote the set of links. 
Let $x_i$ denote the total flow on link i, and $x = [x_1, \ldots, x_I]$ denote the vector of link 


3 Consumer surplus is the difference between users’ willingness to pay (reservation price) and effective costs, 
$p_i + l_i(x_i)$, and is thus different from the social surplus (which is the difference between users’ willingness to 
pay and latency cost, $l_i(x_i)$, thus also takes into account producer surplus/profits). 
flows. Each link in the network has a flow-dependent latency function $l_i(x_i)$, which 
measures the delay as a function of the total flow on link i. We assume that the latency 
function $l_i$ is convex, nondecreasing, and continuously differentiable. The analysis can 
be extended to the case when the links are capacity-constrained as in the previous 
section; see Acemoglu and Ozdaglar (2006b). We also assume that $l_i(0) = 0$ for all i.4 
We denote the price per unit flow (bandwidth) of link i by $p_i$. Let $p = [p_1, \ldots, p_I]$ 
denote the vector of prices. 

We are interested in the problem of routing d units of flow across the I links. 
We assume that this is the aggregate flow of many “small” users and thus adopt 
Wardrop’s principle (see Wardrop, 1952) in characterizing the flow distribution in the 
network; i.e., the flows are routed along paths with minimum effective cost, defined as 
the sum of the latency at the given flow and the price of that path. We also assume that 
the users have a homogeneous reservation utility R and decide not to send their flow if 
the effective cost exceeds the reservation utility. 

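More formally, for a given price vector $p \ge 0$, a vector $x^{WE} \in \mathbb{R}_+^I$ is a Wardrop 
equilibrium (WE) if 


\begin{align}
l_i(x_i^{WE}) + p_i &= \min_j \{ l_j(x_j^{WE}) + p_j \}, \quad \forall i \text{ with } x_i^{WE} > 0, \tag{22.22} \\
l_i(x_i^{WE}) + p_i &\le R, \quad \forall i \text{ with } x_i^{WE} > 0, \notag \\
\sum_{i \in \mathcal{I}} x_i^{WE} &\le d, \notag
\end{align}


with $\sum_{i \in \mathcal{I}} x_i^{WE} = d$ if $\min_j \{ l_j(x_j^{WE}) + p_j \} < R$. We denote the set of WE at a given 
p by W(p).5 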
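
The definition above suggests a simple way to compute a WE on parallel links: search for the common effective cost level at which the used links absorb the demand, capped at the reservation utility R. The Python sketch below does this by nested bisection. It assumes strictly increasing latency functions with zero latency at zero flow; the function names and the test network (the two latency functions of Example 22.1) are illustrative choices only.

```python
def inverse(latency, target, cap, iters=100):
    """Largest flow in [0, cap] whose latency does not exceed target.
    Assumes latency is increasing with latency(0) = 0."""
    lo, hi = 0.0, cap
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if latency(mid) <= target:
            lo = mid
        else:
            hi = mid
    return lo


def flows_at_cost(latencies, prices, t, cap):
    """Flow each link carries when the common effective cost is t;
    links whose price alone is at least t carry nothing."""
    return [inverse(l, t - p, cap) if t > p else 0.0
            for l, p in zip(latencies, prices)]


def wardrop_equilibrium(latencies, prices, d, R, iters=100):
    """Sketch of a WE computation for parallel links with reservation utility R."""
    x = flows_at_cost(latencies, prices, R, d)
    if sum(x) <= d:
        return x  # effective cost reaches R before all d units are routed
    lo, hi = 0.0, R
    for _ in range(iters):
        t = (lo + hi) / 2.0
        if sum(flows_at_cost(latencies, prices, t, d)) < d:
            lo = t
        else:
            hi = t
    return flows_at_cost(latencies, prices, hi, d)


latencies = [lambda x: x * x / 3.0, lambda x: 2.0 * x / 3.0]
x_free = wardrop_equilibrium(latencies, [0.0, 0.0], d=1.0, R=10.0)
x_priced = wardrop_equilibrium(latencies, [8.0 / 27.0, 2.0 / 9.0], d=1.0, R=10.0)
x_low_R = wardrop_equilibrium(latencies, [0.0, 0.0], d=1.0, R=0.1)
print(x_free, x_priced, x_low_R)
```

With zero prices and a large R the flows equalize latency on the two links; with the prices 8/27 and 2/9 the flows move to (2/3, 1/3); and with R = 0.1 part of the demand stays out of the network because the effective cost would otherwise exceed the reservation utility.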

We next define the social problem and the social optimum, which is the routing (flow 
allocation) that would be chosen by a planner that has full information and full control 
over the network. A flow vector $x^S$ is a social optimum if it is an optimal solution of 
the social problem 


\[ \max_{x \ge 0,\ \sum_{i \in \mathcal{I}} x_i \le d} \; \sum_{i \in \mathcal{I}} \big( R - l_i(x_i) \big) x_i. \tag{22.23} \]


Hence, the social optimum is the flow allocation that maximizes the social surplus, i.e., 
the difference between users’ willingness to pay and total latency. For two links, let $x^S$ 
be a social optimum with $x_i^S > 0$ for i = 1, 2. Then it follows from the definition that 


\[ l_1(x_1^S) + x_1^S l_1'(x_1^S) = l_2(x_2^S) + x_2^S l_2'(x_2^S). \tag{22.24} \]


This implies that the prices $x_i^S l_i'(x_i^S)$, i.e., the marginal congestion prices, can be used 
to decentralize the system optimum [cf. Eq. (22.22)]. 
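
As a quick check of (22.24), consider the two-link network of Example 22.1, with $l_1(x) = x^2/3$, $l_2(x) = (2/3)x$, and the efficient allocation (2/3, 1/3) reported there. The arithmetic below is a worked illustration and assumes that both links carry positive flow and that all demand is routed.

```latex
% Check of (22.24) on the network of Example 22.1 with x^S = (2/3, 1/3).
\begin{align*}
l_1(x_1^S) + x_1^S l_1'(x_1^S) &= \tfrac{4}{27} + \tfrac{2}{3} \cdot \tfrac{4}{9} = \tfrac{4}{9}, \\
l_2(x_2^S) + x_2^S l_2'(x_2^S) &= \tfrac{2}{9} + \tfrac{1}{3} \cdot \tfrac{2}{3} = \tfrac{4}{9}.
\end{align*}
% Marginal congestion prices: x_1^S l_1'(x_1^S) = 8/27 and x_2^S l_2'(x_2^S) = 2/9.
```

With these prices the effective cost, latency plus price, equals 4/9 on both links, so the efficient flows satisfy the equal-cost condition in (22.22).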


4 This assumption is a good approximation to communication networks where queueing delays are more sub- 
stantial than propagation delays. We will talk about the efficiency implications of relaxing this assumption in 
different models. 

5 It is possible to account for additional constraints, such as capacity constraints on the links, by using a variational 
inequality formulation (see Acemoglu and Ozdaglar, 2006b; Correa et al., 2005). 

For a given vector $x \ge 0$, we define the value of the objective function in the social 
problem, 


\[ S(x) = \sum_{i \in \mathcal{I}} \big( R - l_i(x_i) \big) x_i, \tag{22.25} \]

as the social surplus, i.e., the difference between users’ willingness to pay and the total 
latency. 


22.2.3 Monopoly Pricing and Equilibrium 


We first assume that a monopolist service provider owns the I links and charges a price 
of $p_i$ per unit bandwidth on link i. The monopolist sets the prices to maximize his 
profit given by 


\[ \Pi(p, x) = \sum_{i \in \mathcal{I}} p_i x_i, \]

where $x \in W(p)$. This defines a two-stage dynamic pricing-congestion game, where 
the monopolist sets prices anticipating the demand of users, and given the prices (i.e., 
in each subgame), users choose their flow vectors according to the WE. We define a 
vector $(p^{ME}, x^{ME}) \ge 0$ to be a Monopoly Equilibrium (ME) if $x^{ME} \in W(p^{ME})$ and 


\[ \Pi(p^{ME}, x^{ME}) \ge \Pi(p, x), \quad \forall p \ge 0, \ \forall x \in W(p).^{6} \]


In Acemoglu and Ozdaglar (2006b), it was shown that price-setting by a monopolist 
internalizes the negative externality and achieves efficiency. In particular, a vector x is 
the flow vector at an ME if and only if it is a social optimum. This result was extended 
to a model that incorporates a general network topology in Huang et al. (2006). This 
is a significant departure from the existing performance results of selfish routing in 
the literature that assert that the efficiency losses with general latency functions can be 
arbitrarily bad. 
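The coincidence of monopoly and socially optimal flows can be checked on the simplest possible instance. The sketch below assumes a single link with linear latency l(x) = a x (a > 0); the function names and the grid search over prices are illustrative choices, not the method of the cited paper.

```python
def we_flow(p, a, R, d):
    # Unique WE on a single link with latency l(x) = a*x, a > 0:
    # users enter until the effective cost a*x + p reaches R or demand d is exhausted.
    return min(d, max(0.0, (R - p) / a))

def monopoly_flow(a, R, d, steps=1000):
    # Grid search over prices in [0, R] for the profit-maximizing price.
    prices = [R * k / steps for k in range(steps + 1)]
    best = max(prices, key=lambda p: p * we_flow(p, a, R, d))
    return best, we_flow(best, a, R, d)

def social_flow(a, R, d):
    # Maximizer of (R - a*x)*x on [0, d]; the unconstrained peak is at R/(2a).
    return min(d, R / (2 * a))
```

With a = 1, R = 1 and d = 1, the grid search selects the price 0.5 and both flows equal 0.5. With d = 0.3 the monopolist prices at 0.7 and both flows equal 0.3, so the demand constraint binds in the same way in both problems.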


22.2.4 Oligopoly Pricing and Equilibrium 


We next assume that there are S service providers, denote the set of service providers
by $\mathcal{S}$, and assume that each service provider $s \in \mathcal{S}$ owns a different subset $\mathcal{I}_s$ of the
links. Service provider s charges a price $p_i$ per unit bandwidth on link $i \in \mathcal{I}_s$. Given
the vector of prices of links owned by other service providers, $p_{-s} = [p_i]_{i \notin \mathcal{I}_s}$, the profit
of service provider s is

$$\Pi_s(p_s, p_{-s}, x) = \sum_{i \in \mathcal{I}_s} p_i x_i,$$

for $x \in W(p_s, p_{-s})$, where $p_s = [p_i]_{i \in \mathcal{I}_s}$.
The objective of each service provider, like the monopolist in the previous section, 
is to maximize profits. Because their profits depend on the prices set by other service 


6 Our definition of the ME is stronger than the standard subgame perfect Nash equilibrium concept for dynamic
games. In Acemoglu and Ozdaglar (2006b), we show that the two solution concepts coincide for this game. 
providers, each service provider forms conjectures about the actions of other service
providers, as well as the behavior of users, which, we assume, they do according to
the notion of (subgame perfect) Nash equilibrium. We refer to the game among service
providers as the price competition game. We define a vector $(p^{OE}, x^{OE}) \ge 0$ to be a
(pure strategy) Oligopoly Equilibrium (OE) if $x^{OE} \in W(p_s^{OE}, p_{-s}^{OE})$ and for all $s \in \mathcal{S}$,

$$\Pi_s(p_s^{OE}, p_{-s}^{OE}, x^{OE}) \ge \Pi_s(p_s, p_{-s}^{OE}, x), \quad \forall p_s \ge 0, \ \forall x \in W(p_s, p_{-s}^{OE}). \qquad (22.26)$$

We refer to $p^{OE}$ as the OE price.

Analysis of the optimality conditions for the oligopoly problem [cf. (22.26)] allows 
us to characterize the OE prices (see Acemoglu and Ozdaglar, 2006b). In particular, 
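let $(p^{OE}, x^{OE})$ be an OE such that $p_i^{OE} x_i^{OE} > 0$ for some $i \in \mathcal{I}$. Then, for all $s \in \mathcal{S}$ and
$i \in \mathcal{I}_s$,

$$p_i^{OE} = \begin{cases} x_i^{OE}\, l_i'(x_i^{OE}), & \text{if } l_j'(x_j^{OE}) = 0 \text{ for some } j \notin \mathcal{I}_s, \\[4pt] \min\left\{ R - l_i(x_i^{OE}),\ x_i^{OE}\, l_i'(x_i^{OE}) + \dfrac{\sum_{j \in \mathcal{I}_s} x_j^{OE}}{\sum_{j \notin \mathcal{I}_s} 1 / l_j'(x_j^{OE})} \right\}, & \text{otherwise.} \end{cases}$$

The preceding characterization implies that in the two link case with minimum
effective cost less than R, the OE prices satisfy

$$p_i^{OE} = x_i^{OE} \big( l_1'(x_1^{OE}) + l_2'(x_2^{OE}) \big) \qquad (22.27)$$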


as claimed before. Intuitively, the price charged by an oligopolist consists of two terms:
the first, $x_i^{OE}\, l_i'(x_i^{OE})$, is equal to the marginal congestion price that a social planner would
set [cf. Eq. (22.24)] because the service provider internalizes the further congestion
caused by additional traffic. The second, $x_i^{OE}\, l_j'(x_j^{OE})$ with $j \ne i$, reflects the markup that each
service provider can charge users because of the negative congestion externality (as
users leave its network, they increase congestion in the competitor network).
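The two-term structure of the two-link price can be seen in a small worked instance. The sketch below assumes linear latencies l_i(x) = a_i x with a_i > 0, one provider per link, all d units routed, and a reservation utility R above the resulting effective cost; the function `two_link_oe` is illustrative. It returns the flows and prices that jointly satisfy the two-link price formula and the equal-effective-cost condition of the WE, so it yields a candidate OE rather than a proof that one exists.

```python
from fractions import Fraction

def two_link_oe(a1, a2, d=1):
    """Candidate OE for l_i(x) = a_i*x, one provider per link, all flow routed."""
    a1, a2, d = Fraction(a1), Fraction(a2), Fraction(d)
    # Equal effective costs with p_i = x_i*(a1 + a2) give
    # x1*(2*a1 + a2) = x2*(a1 + 2*a2), together with x1 + x2 = d.
    x1 = d * (a1 + 2 * a2) / (3 * (a1 + a2))
    x2 = d - x1
    slope_sum = a1 + a2          # l_1' + l_2' for linear latencies
    p1, p2 = x1 * slope_sum, x2 * slope_sum
    return (x1, x2), (p1, p2)
```

For a1 = 1 and a2 = 2 this gives flows (5/9, 4/9) and prices (5/3, 4/3). The price on link 1 splits into the marginal congestion term 5/9 and the markup 10/9, and both links have effective cost 20/9, so the sketch applies only when R exceeds 20/9.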


22.2.5 Efficiency Analysis 


We investigate the efficiency properties of price competition games that have pure
strategy equilibria.⁷ Given a price competition game with latency functions $\{l_i\}_{i \in \mathcal{I}}$, we
define the efficiency metric at some oligopoly equilibrium flow $x^{OE}$ as the ratio of the
social surplus in the oligopoly equilibrium to the surplus in the social optimum [cf. Eq.
(22.25) for the definition of the social surplus], i.e., the efficiency metric is given by

$$r_I(\{l_i\}, x^{OE}) = \frac{S(x^{OE})}{S(x^{S})}, \qquad (22.28)$$

where $x^S$ is a social optimum given the latency functions $\{l_i\}_{i \in \mathcal{I}}$ and R is the reservation


utility. In other words, the efficiency metric is the ratio of the social surplus in an 
equilibrium relative to the surplus in the social optimum. Following the literature on 
the “price of anarchy,” in particular Koutsoupias and Papadimitriou (1999), we are 
interested in the worst-case performance of an oligopoly equilibrium, so we look for 


7 This set includes, but is substantially larger than, games with linear latency functions, see Acemoglu and 
Ozdaglar (2006a). 
a lower bound on $r_I(\{l_i\}, x^{OE})$ over all price competition games and all oligopoly
equilibria.

We next give an example of an I link network that has positive flows on all links at
the OE and an efficiency metric of 5/6.


Example 22.2 Consider an I link network where each link is owned by
a different provider. Let the total flow be d = 1 and the reservation utility be
R = 1. The latency functions are given by

$$l_1(x) = 0, \qquad l_i(x) = \frac{3}{2}(I - 1)\, x, \quad i = 2, \ldots, I.$$

The unique social optimum for this example is $x^S = [1, 0, \ldots, 0]$. It can be seen
that the flow allocation at the unique OE is $x^{OE} = \left[ \frac{2}{3}, \frac{1}{3(I-1)}, \ldots, \frac{1}{3(I-1)} \right]$. Hence,
the efficiency metric for this example is $r_I(\{l_i\}, x^{OE}) = 5/6$.
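The value 5/6 can be confirmed by evaluating the social surplus at the two flow vectors stated in the example. The sketch below uses exact rational arithmetic; the function names are illustrative, and the OE flow is taken from the example as given rather than derived.

```python
from fractions import Fraction

def surplus(x, latencies, R):
    # Social surplus: sum over links of (R - l_i(x_i)) * x_i.
    return sum((R - l(xi)) * xi for xi, l in zip(x, latencies))

def example_22_2_efficiency(I):
    R = Fraction(1)
    slope = Fraction(3, 2) * (I - 1)
    # Link 1 has zero latency; links 2..I have latency (3/2)(I-1)x.
    latencies = [lambda x: Fraction(0)] + [lambda x, s=slope: s * x] * (I - 1)
    x_s = [Fraction(1)] + [Fraction(0)] * (I - 1)
    x_oe = [Fraction(2, 3)] + [Fraction(1, 3 * (I - 1))] * (I - 1)
    return surplus(x_oe, latencies, R) / surplus(x_s, latencies, R)
```

Each congested link carries flow 1/(3(I-1)) at latency 1/2, so the total latency cost at the OE is 1/6 regardless of I, and the ratio is 5/6 for every I ≥ 2.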


The next theorem establishes the main efficiency result. 


Theorem 22.3 Consider a general parallel link network with I ≥ 2 links and
S service providers, where provider s owns a set of links $\mathcal{I}_s \subseteq \mathcal{I}$. Then, for all
price competition games with pure strategy OE flow $x^{OE}$, we have

$$r_I(\{l_i\}, x^{OE}) \ge \frac{5}{6},$$

and the bound is tight.


A notable feature of Example 22.2 and this theorem is that the (tight) lower bound on
efficiency is independent of the number of links I and how these links are distributed
across different oligopolists (i.e., of market structure). Thus arbitrarily large networks
can feature as much inefficiency as small networks.⁸


22.2.6 Extensions 


In this subsection, we extend the preceding analysis in two directions: First, we con- 
sider elastic traffic, which models applications that are tolerant of delay and can take 
advantage of even the minimal amounts of bandwidth (e.g., e-mail). We next focus on 
more general network topologies. 


Elastic Traffic 


To model elastic traffic, we assume that user preferences can be represented by an
increasing, concave, and twice continuously differentiable aggregate utility function
$u(\sum_{i \in \mathcal{I}} x_i)$, which represents the amount of utility gained from sending a total amount
of flow $\sum_{i \in \mathcal{I}} x_i$ through the network.


8 This result superficially contrasts with theorems in the economics literature that large oligopolistic markets 
approach competitive behavior. These theorems do not consider arbitrary large markets, but replicas of a given 
market structure. 
We assume that at a price vector, the amount of flow and the distribution of flow
across the links are given by Wardrop’s principle (Wardrop, 1952). In particular, for
a given price vector p ≥ 0, a vector $x^* \in \mathbb{R}_+^I$ is a Wardrop equilibrium if

$$l_i(x_i^*) + p_i = u'\Big( \sum_{j \in \mathcal{I}} x_j^* \Big), \quad \forall i \text{ with } x_i^* > 0,$$

$$l_i(x_i^*) + p_i \ge u'\Big( \sum_{j \in \mathcal{I}} x_j^* \Big), \quad \forall i \in \mathcal{I}.$$

We define the social optimum and the efficiency metric as in Eqs. (22.23) and (22.28),
replacing $R \sum_{i \in \mathcal{I}} x_i$ (i.e., users’ willingness to pay) by $u(\sum_{i \in \mathcal{I}} x_i)$.
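With elastic traffic the common effective cost is pinned down by marginal utility rather than by a fixed demand. The following is a minimal sketch, assuming linear latencies l_i(x) = a_i x with a_i > 0 and a user-supplied marginal utility function `du` (standing for u'); these names and the bisection approach are illustrative, not taken from the cited work.

```python
def elastic_wardrop(a, p, du, tol=1e-12):
    """WE flows for l_i(x) = a[i]*x (a[i] > 0), prices p >= 0, marginal utility du."""
    def flows(c):
        # Flow on each link when the common effective cost is c.
        return [max(0.0, (c - p_i) / a_i) for a_i, p_i in zip(a, p)]

    # g(c) = c - u'(total flow at cost c) is increasing in c, because total
    # flow rises with c and u' is nonincreasing (u is concave). Bisect for g = 0
    # on [0, u'(0)], where g(0) < 0 and g(u'(0)) >= 0.
    lo, hi = 0.0, du(0.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid < du(sum(flows(mid))):
            lo = mid
        else:
            hi = mid
    return flows(hi)
```

As a usage example, take a single free link with l(x) = x and the (assumed) utility u(y) = log(1 + y), so that u'(y) = 1/(1 + y). The equilibrium flow solves x = 1/(1 + x), which is (√5 - 1)/2, roughly 0.618.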

It can be shown that for elastic traffic with a general concave utility function, 
the efficiency metric can be arbitrarily close to 0 (see Ozdaglar, 2006). The two-stage 
game with multiple service providers and elastic traffic with a single user class was first 
analyzed by Hayrapetyan, Tardos and Wexler (2005). Using an additional assumption 
on the utility function (i.e., the utility function has a concave first derivative), their 
analysis provides nontight bounds on the efficiency loss.⁹ Using mathematical tools
similar to the analysis in Acemoglu and Ozdaglar (2006b), the recent work (Ozdaglar, 
2006) provides a tight bound on the efficiency loss of this game, as established in the 
following theorem. 


Theorem 22.4 Consider a parallel link network with I > 1 links, where each 
link is owned by a different provider. Assume that the derivative of the utility 
function, u', is a concave function. Then, for all price competition games with
elastic traffic and pure strategy OE flow $x^{OE}$, we have


and the bound is tight. 


Parallel-Serial Topologies 


Most communication networks cannot be represented by parallel link topologies, 
however. A given source-destination pair will typically transmit through multiple inter- 
connected subnetworks (or links), potentially operated by different service providers. 
Existing results on the parallel-link topology do not address how the cooperation 
and competition between service providers will impact efficiency in such general 
networks. 

Here, we take a step in this direction by considering the simplest network topol- 
ogy that allows for serial interconnection of multiple links/subnetworks, which is the 
parallel-serial topology (see Figure 22.2). It was shown in Acemoglu and Ozdaglar 
(2006a) that the efficiency losses resulting from competition are considerably higher 
with this topology. When a particular provider charges a higher price, it creates a nega- 
tive externality on other providers along the same path, because this higher price reduces 


9 For example, they provide the nontight bound of 1/5.064 in general, and the bound of 1/3.125 for the case when
latency without congestion is 0. 
Figure 22.2. A network with serial and parallel links.


the transmission that all the providers along this path receive. This is the equivalent 
of the double marginalization problem in economic models with multiple monopolies 
and is the source of the significant degradation in the efficiency performance of the 
network. 

In its most extreme form, the double marginalization problem leads to a type of 
“coordination failure,” whereby all providers, expecting others to charge high prices,
also charge prohibitively high prices, effectively killing all data transmission on a given 
path. We may expect such a pathological situation not to arise since firms should not 
coordinate on such an equilibrium (especially when other equilibria exist). For this 
reason, we focus on a stronger concept of equilibrium introduced by Harsanyi, the 
strict equilibrium. In strict OE, each service provider must play a strict best response 
to the pricing strategies of other service providers. We also focus our attention on 
equilibria in which all traffic is transmitted (otherwise, it can be shown that the double
marginalization problem may shut down transmission entirely, resulting in
arbitrarily low efficiency; see Acemoglu and Ozdaglar, 2006a).

The next theorem establishes the main efficiency result for this topology. 


Theorem 22.5 Consider a general I > 2 path network, with serial links on 
each path, where each link is owned by a different provider. Then, for all price 
competition games with strict OE flow $x^{OE}$, we have


and the bound is tight. 


Despite this positive result, it was shown in Acemoglu and Ozdaglar (2006a) that 
when the assumption $l_i(0) = 0$ is relaxed, the efficiency loss of strict OE relative to the
social optimum can be arbitrarily large. This suggests that unregulated competition in 
general communication networks may have considerable costs in terms of the efficiency 
of resource allocation and certain types of regulation may be necessary to make sure 
that service provider competition does not lead to significant degradation of network 
performance. 
22.3 Alternative Pricing and Incentive Approaches


The two approaches we have presented so far incorporate many of the important ideas 
in the role of prices and incentives in communication networks. Nevertheless, a variety 
of different approaches have also been developed in the literature, and the models 
presented in the previous two sections leave out several interesting aspects, which can 
be studied in future work. In this section, we first discuss the previous work on pricing 
in networks. We then mention several alternative approaches pursued in ongoing work. 
We conclude with a number of areas for future research. 


22.3.1 Previous Work on Pricing 


Despite the fact that current Internet access is based on a flat access charge, it has been 
recognized that the future of the Internet will involve multiple service classes, their use 
regulated by differentiated prices. The most natural approach to this problem involves 
the modeling of profit-maximizing service providers as developed in the previous 
section. Here we discuss some other aspects involved in the use of such prices. 


Pricing for Differentiated Services: Service differentiation brings in a clear need for 
offering incentives to users to encourage them to choose the service appropriate for 
their needs, hence preventing overutilization of network resources. Pricing mechanisms 
provide an efficient way to ensure QoS guarantees and regulate system usage. One of the 
key debates in the network pricing area is whether charges should be based on fixed access
prices or usage-based prices. While usage-based pricing has the potential to fulfill at 
least partially the role of a congestion control mechanism, there were criticisms in view 
of the apparent disadvantages of billing overheads and the resulting uncertainties in 
networking expenses (see DaSilva, 2000). 

A variety of pricing mechanisms have been proposed over the last decade. A well- 
known usage-based pricing proposal is by Mackie-Mason and Varian (1995), who 
proposed a “smart market” for resource allocation over a single link. In this scheme, 
users bid for transmission of each individual packet while the network provides service 
to packets whose bid exceeds a cutoff level determined by the marginal willingness-to- 
pay and marginal congestion costs. Users do not pay the price they bid, but rather the 
market-clearing price, which is lower than the bids of all admitted packets. This mechanism
resembles the Vickrey auction, and therefore provides users the correct incentives
to reveal their true values in their bids. Odlyzko, in his seminal Paris Metro Pricing 
proposal (1999), suggested partitioning the network into several logical subnetworks.
Users choose one of these logical networks for the transmission of their traffic, and 
this implicitly defines the service level; i.e., higher-priced networks will experience 
lower utilizations, and therefore will be able to provide a higher service level. Other 
proposed pricing schemes include edge-pricing, which focuses on locally computed 
charges based on expected values of congestion levels and routes; expected capacity 
pricing, in which users are charged according to the expected capacity the network 
provisions; and effective bandwidth pricing, which proposes the pricing of real-time 
traffic with QoS requirements, in terms of its “effective bandwidth”; see DaSilva (2000) 
for an overview of various pricing mechanisms. 
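The uniform-price logic of the smart market can be illustrated with a short sketch. It makes a simplifying assumption that is not in the original proposal: the link admits a fixed number of packets (`capacity`), whereas the text describes a cutoff determined by marginal willingness-to-pay and marginal congestion costs. The function name and return format are illustrative.

```python
def smart_market(bids, capacity):
    """Admit the `capacity` highest bids; every admitted packet pays one cutoff price."""
    # Packet indices sorted from highest to lowest bid.
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    admitted = sorted(order[:capacity])
    # Cutoff price: the highest rejected bid, or 0 if every packet is admitted.
    price = bids[order[capacity]] if capacity < len(bids) else 0
    return admitted, price
```

With bids [5, 3, 8, 1] and capacity 2, the packets bidding 8 and 5 are admitted and each pays 3. An admitted packet's payment does not depend on its own bid, which is the Vickrey-like property the text refers to. When bids tie at the cutoff, the price can equal an admitted bid rather than lie strictly below it.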
First-Best Pricing: There is also a large theoretical literature in both the communication
networks and transportation networks areas that studies control mechanisms to induce
efficient allocation of resources among competing users. The main focus is to use
prices (or tolls) to induce flow patterns that optimize an overall system objective
(also referred to as first-best pricing). It is well-known that marginal cost pricing, i.e.,
charging individual users for the negative (congestion) externality they impose on other 
users, achieves the system optimal flows. A number of studies have also characterized 
the “toll set,” i.e., the set of all tolls that induce optimal flows, with the goal of choosing 
tolls from this set according to secondary criteria, e.g., minimizing the total amount of 
tolls or the number of tolled routes; see Hearn and Ramana (1998). Other related work 
focuses on models with heterogeneous users (i.e., users with different congestion-price 
sensitivities) and studies tolls that induce system optimal flows (see Cole et al., 2003; 
Fleischer et al., 2004). 
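Marginal cost pricing can be illustrated on a hypothetical two-link network that is not taken from the text: link 1 has latency l_1(x) = x, link 2 has constant latency l_2(x) = 1, and one unit of inelastic traffic must be routed (no reservation utility). The sketch below compares the untolled equilibrium with the equilibrium under the toll x l_1'(x) evaluated at the system optimum.

```python
def equilibrium_flow_on_link1(toll, d=1.0):
    # Users split so that x + toll = 1 (the cost of link 2) when possible;
    # otherwise all traffic uses the cheaper link.
    return min(d, max(0.0, 1.0 - toll))

def total_latency(x, d=1.0):
    # x units experience latency x on link 1; the rest experience latency 1.
    return x * x + (d - x) * 1.0

def marginal_cost_toll(x):
    # x * l_1'(x) with l_1'(x) = 1; link 2 has zero slope, hence zero toll.
    return x * 1.0
```

Without a toll all traffic takes link 1 and total latency is 1. The system optimum puts half of the traffic on link 1 for a total latency of 0.75, and charging the marginal cost toll of 0.5 on link 1 makes that split an equilibrium.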


22.3.2 Current Research on Pricing and Incentive Models 


Many other game-theoretic models are useful in studying communication networks. 
Instead of providing a comprehensive survey, we now discuss a few models that are of 
significant practical relevance. 


Fixed Pricing and the Marginal User Principle: As mentioned in the previous 
subsection, for various practical reasons (some of which are perhaps simply legacy 
reasons), consumers are accustomed to paying a flat fee (e.g., monthly) for their service.
In markets with a flat fee, typically a service provider has some idea of the distribution
of the users’ utility functions but not the utility function of each individual user.

An important problem therefore is to determine the fixed flat fee that maximizes 
the service provider revenue and to understand the impact of such a pricing scheme 
on the allocation of resources. In Acemoglu et al. (2004), we show that in a wireless 
network the profit-maximizing fixed price is equal to the utility of the marginal user in 
the network, where the marginal user is defined as a user who is indifferent to joining 
the network. Since the price and the resource allocation scheme determine the marginal 
user, they have to be chosen jointly to maximize the network revenue and it has been 
shown in Acemoglu et al. (2004) that such a resource allocation algorithm and price can 
be computed by the service provider under certain assumptions on the utility functions. 


Incentives for Cooperation in P2P Networks: It is estimated that nearly half the 
traffic in today’s Internet is due to peer-to-peer (P2P) networks. P2P networks are
typically used to share large files among users. Some well-known examples of P2P networks
are BitTorrent, Gnutella, and KaZaA. A P2P network is a collection of a large number
of users who contribute some resources (typically bandwidth and memory) not only to
download files of interest to themselves but also to store and transmit files that may be
of interest to others. A P2P network has remarkable scaling properties compared to a
Web server that stores many files that can be downloaded by users. A Web server has
finite upload bandwidth and therefore, as more users join the network, the bandwidth
per user has to decrease. On the other hand, in a P2P network, since each user is
potentially a server as well as a client, as the number of users in the network increases, the
capacity of the network also increases to keep up with the demand. In fact, simple
analytical models suggest that there is no loss of performance as the number of users
increases in a BitTorrent-type network (Qiu and Srikant, 2004). However, such scaling 
benefits can be achieved only if users cooperate. For example, if all users are only 
willing to download but refuse to upload files, then the network capacity will not 
scale with the number of users. Networks such as BitTorrent have some simple built-in 
incentive mechanisms to combat such problems and these have been studied in Qiu and 
Srikant (2004). As P2P networks continue to proliferate, it becomes quite important to 
study incentive mechanisms for such networks. Such issues are studied elsewhere in 
this book. 
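The scaling argument reduces to simple arithmetic. The following toy calculation is a sketch under strong assumptions (a fluid model in which upload capacity is the only bottleneck and is shared equally); the function names and the `sharing_fraction` parameter are hypothetical and are not the model of Qiu and Srikant (2004).

```python
def per_user_rate_server(server_upload, n_users):
    # A single server splits its fixed upload bandwidth among all users.
    return server_upload / n_users

def per_user_rate_p2p(seed_upload, peer_upload, n_users, sharing_fraction=1.0):
    # Total capacity grows with the number of peers that actually upload.
    total = seed_upload + sharing_fraction * n_users * peer_upload
    return total / n_users
```

With a server upload of 100 units, the per-user rate falls from 10 at 10 users to 0.1 at 1,000 users. If each of 1,000 peers also uploads 1 unit, the per-user rate is 1.1, but it drops back to 0.1 when no peer uploads (`sharing_fraction=0`), which is the free-riding case described above.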


Incentives for Cooperation in Wireless Networks: Another form of networking that 
is expected to see tremendous growth in the near future is multihop wireless networks. 
In such networks, laptop computers or other mobile radio devices will communicate
with each other in a multihop fashion without any infrastructure such as an access point 
or a base station. For such communication to be feasible, each radio must be willing 
to forward packets for other users in the network. While on the face of it, the problem 
appears to be similar to the case of P2P networks, there are some key differences. In 
a wireless network, since the communication medium is shared, it is possible for a 
wireless node (say node A) to hear whether a neighbor (call it node B) is being selfish 
or not. For example, if node A forwards a packet (destined for another node) to node 
B, then A can listen to see if B forwarded the packet or not. However, if another 
neighbor of A (say, node C) transmits at the same time as node B, then A will not 
hear B’s transmission and thus, may erroneously assume that B is a selfish user. This 
is similar to a prisoner’s dilemma model with noisy observations of the players’ true 
actions (Piccione, 2002) and has been studied in He et al. (2004) and Mahajan et al. 
(2005) in a non-game-theoretic setting and in Milan et al. (2006) using game theory. 
However, the models used for the analysis of cooperation in multihop radio networks 
are currently quite simplistic and ignore the topological structure of the network. It is 
an open problem to develop more detailed models of the network and medium-access 
protocols, and to study the game-theoretic interactions for these more realistic models. 


22.3.3 Areas for Future Research 


The models presented so far highlight a number of fruitful areas for future research. 
These include but are not limited to the following topics. 


Incentive-compatible Differentiated Pricing: As discussed above, a key role of prices 
in networks will be in allocating users with different requirements to differentiated 
services. If the service requirements and other characteristics of users were known 
by a central controller or service providers, this problem would be similar to those 
studied above. In practice, however, such information is not available and the market 
mechanism (i.e., the pricing scheme) has to ensure that individuals choose the services 
designed for them. This problem can be analyzed as a combination of the competition 
models developed above and the classical mechanism design approach. In particular, 
the celebrated Revelation Principle in the mechanism design theory (see Mas-Colell 
et al., 1995) implies that we can think of direct mechanisms in which individuals 
truthfully report their types, and are allocated services and charged prices accordingly. 
The mathematical formulation then necessitates that a set of incentive-compatibility
constraints that make truthful reporting optimal for each user is satisfied. The modeling 
challenge in this approach lies in combining the competition among service providers 
and the incentive-compatibility constraints. 


Capacity Investments: While the focus of the current literature has been on ensuring the
efficiency of the allocation of existing network resources, an arguably more important 
problem is to ensure that the right amount and type of infrastructure investment and 
capacity are installed in newly emerging networks. The analysis of this set of problems 
requires (multi-stage) models in which service providers choose not only prices but 
also investment levels and capacities. 


Simple Pricing Rules: One potential criticism of economic approaches for resource al- 
location in networks is whether the complicated pricing schemes necessary for achiev- 
ing socially optimal or profit-maximizing allocations can be computed and imple- 
mented in real time. The question of whether simple pricing rules can approximate 
these objectives and the quantification of the extent of efficiency or profits from such 
simple rules constitute another area for future research. 
CHAPTER 23


Incentives in Peer-to-Peer 
Systems 


Moshe Babaioff, John Chuang, and Michal Feldman 


Abstract 


Peer-to-peer (p2p) systems support many diverse applications, ranging from file-sharing and dis- 
tributed computation to overlay routing in support of anonymity, resiliency, and scalable multimedia 
streaming. Yet, they all share the same basic premise of voluntary resource contribution by the partic- 
ipating peers. Thus, the proper design of incentives is essential to induce cooperative behavior by the 
peers. With the increasing prevalence of p2p systems, we have not only concrete evidence of strategic 
behavior in large-scale distributed systems but also a live laboratory to validate potential solutions 
with real user populations. In this chapter we consider theoretical and practical incentive mechanisms, 
based on reputation, barter, and currency, to facilitate peer cooperation, as well as mechanisms based 
on contracts to overcome the problem of hidden actions. 


23.1 Introduction 


The public release of Napster in June 1999 and Gnutella in March 2000 introduced 
the world to the disruptive power of peer-to-peer (p2p) networking. Tens of millions 
of individuals spread across the world could now self-organize and collaborate in 
the dissemination and sharing of music and other content, legal or otherwise. Yet, 
within 6 months of its public release, and long before individual users were threatened
by copyright infringement lawsuits, the Gnutella network saw two thirds of its
users free-riding, i.e., downloading files from the network without uploading any in 
return. 

Given the large-scale, high-turnover, and relative anonymity of the p2p file-sharing 
networks, most p2p transactions are one-shot interactions between strangers that will 
never meet again in the future. It is therefore unsurprising that cooperation is difficult 
to sustain in these networks. The problem is exacerbated by hidden action due to 
nondetectable defections, and by the ability of peers to create multiple identities at 
no cost. It quickly became clear to the p2p developers community that some form of 
incentives is needed to overcome this free-riding problem. 




The subsequent generation of p2p file-sharing networks incorporated incentive 
mechanisms based on currency or reputation. For example, in Mojonation, peers earn 
mojos through contributions to others, and use the earned currency to redeem for ser- 
vice from others. In KaZaA, peers build up their reputation scores by uploading, and 
highly reputed peers receive preferential treatment in their downloads. 

The BitTorrent file-sharing system went beyond currency and reputation, and 
adopted an incentive mechanism based on barter. By partitioning large files such 
as movies and software binaries into small chunks, file-sharing using the BitTorrent 
protocol necessitates repeat interactions among peers, allowing cooperation to flourish 
based on direct reciprocity rather than indirect reciprocity. From a system perspective, 
there is no need to keep long-term state information, in the form of either reputation 
or currency. This simplifies the design and improves its robustness against attacks. 
Empirical studies found much lower levels of free-riding in BitTorrent communities. 
Yet, theoretical analysis has demonstrated that the BitTorrent protocol can still be 
manipulated by selfish peers in their favor. 

The issue of incentives in p2p systems goes far beyond free-riding in file-sharing 
networks. Grassroots contributions by autonomous peers are needed to sustain many
networked systems, ranging from mobile ad hoc networks and community-based wire- 
less mesh networks, to application layer overlay networks that support anonymous 
communications and live video streaming. Even interdomain routing over the Internet 
requires the cooperation of competing network operators. 

The strategy space is also far richer than the binary choice of share/not-share in 
file-sharing networks. Peers make strategic decisions concerning the revelation of pri- 
vate information, such as local resource availability, workload, contribution cost, or 
willingness-to-pay. Peers decide on the amount of exerted effort, given the nonob- 
servability of their hidden actions. Peers may adjust their spatial engagement with 
the network through strategic network formation, and temporal engagement through 
strategic churning (arrivals and departures). Finally, peers may choose to manage their 
own identities and treat the identities of others differently given the availability of 
cheap pseudonyms. 

The increasing prevalence of p2p systems, coupled with the rich strategy space 
available to the peers, makes the problem of p2p mechanism design a challenging
and broadly relevant topic of study for algorithmic game theory. P2P systems offer 
a concrete example of strategic behavior in large-scale distributed systems, as well 
as a live laboratory to validate potential solutions with real user populations. In this 
chapter, we discuss some p2p incentive mechanisms based on reputation, barter, and 
currency, as well as mechanisms to overcome the problem of hidden actions. We refer 
readers to other chapters in this book on the related topics of distributed algorithmic 
mechanism design (Chapter 14), strategic network formation (Chapter 19), network 
pricing (Chapter 22), and reputation systems (Chapter 27). 


23.2 The p2p File-Sharing Game 


A p2p file-sharing system seeks to support efficient and scalable distribution of files 
by leveraging the upload bandwidth of the downloading peers. In a p2p file-sharing 
Figure 23.1. The temporal evolution of strategy populations in a p2p file-sharing game. “Time”
is the number of elapsed rounds. “Population” is the number of players using a strategy. 


system, a peer plays one of two roles. For certain interactions, he is a client who 
wishes to download a file, and derives benefit from a successful download. For other 
interactions, he is a server who is requested to upload part or all of a file, and if he 
agrees he may bear some cost in the form of bandwidth and CPU usage. In such a 
one-shot game, “free-riding” is a dominant strategy — a player will download when he 
is a client, and refuse to upload when he is a server. 

The interaction between players in a p2p file-sharing system has many characteristics 
of the Prisoner’s Dilemma (PD) game. In the single-shot PD game, players have a 
dominant strategy to defect, which leads to a socially undesirable equilibrium outcome 
known as the “tragedy of the commons.” In the Iterated Prisoner’s Dilemma game, 
cooperation can be sustained through direct reciprocity (e.g., using the Tit-for-Tat or 
TFT strategy) since a defection in the current round can lead to retaliation by the other 
player in a future round. This “shadow of the future” can similarly sustain cooperation 
in the p2p file-sharing game, where a peer may decide to upload a file to another peer 
with the expectation that he may wish to download a file from the other peer sometime 
in the future. 

Of course, there is no guarantee that two peers will engage in multiple transactions 
with each other in their lifetimes. Even if they do, there is no guarantee that they will do 
so with a proper reversal of client and server roles to facilitate reciprocity or retaliation. 
In a large dynamic population with random matching of players, the probability of 
repeat interactions between players may be too small to cast an effective “shadow of 
the future,” and free-riding might prevail. 
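How quickly the shadow of the future fades with population size can be quantified with a back-of-the-envelope calculation. The sketch below assumes that in each round a peer is matched with a partner drawn uniformly at random from the other members of the population; the function name is illustrative.

```python
def prob_meet_again(population, rounds):
    """Chance that one given partner is drawn at least once in `rounds` matchings."""
    # Each round, a specific partner is drawn with probability 1/(population - 1).
    return 1.0 - (1.0 - 1.0 / (population - 1)) ** rounds
```

Over 100 rounds, a peer in a population of 101 meets a given partner again with probability of about 0.63, whereas in a population of 100,001 the probability is about 0.001. Under this assumption a defection is almost never punished by its victim in a large network.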

Figure 23.1, taken from a simulation study of a p2p file-sharing game (Feldman et al., 
2004), illustrates the inability of a reciprocative strategy to scale to large populations. 
Starting with equal shares of players that (1) always defect, (2) always cooperate, and
(3) play a reciprocative strategy (a generalization of TFT for interleaved interactions 
with multiple peers), the game proceeds in rounds where the size of the population 
that plays each strategy is proportional to its success in the previous round. We see 
in Figure 23.1(a) that with a relatively small population, the reciprocative strategy 
dominates the population after 1,000 rounds. However, the strategy does not scale to 
larger populations, as seen in Figure 23.1(b), since the interactions between pairs of 
players are not frequent enough to make the strategy effective against defectors. 

This suggests that strategies based on the notion of direct reciprocity may not fit the 
environment of p2p systems with random matching and large populations. One way to 
overcome this is to enforce repeated interactions with a small number of peers, as is 
done in BitTorrent (discussed in further detail in Section 23.4). This design works well 
for the sharing of large and popular files, e.g., movies and software binaries, since there 
are large numbers of peers who are concurrently interested in a file, and are willing to 
engage in repeated interactions to exchange file segments with one another. 

To support cooperation over multiple files and longer timescales, some form of 
information sharing among the peers may be needed. This marks a shift from direct 
reciprocity to indirect reciprocity. Reputation systems (discussed in Section 23.3) 
provide a means for a peer to condition his action against his opponent upon the 
opponent’s past actions, not just against the peer himself, but against other peers in the 
system. This way, a peer may choose to serve a file to another peer on the grounds that 
the latter had cooperated with other peers in earlier interactions. 

Because p2p systems are large, dynamic systems with high turnover rates, peers 
often interact with strangers with no prior history or reputation. It is therefore very 
important to think about how one deals with strangers. A tit-for-tat strategy that always 
cooperates with strangers may encourage newcomers to join the system, but it can be 
easily exploited by whitewashers who leave and rejoin the system with new identi- 
ties. The problem arises because a whitewasher is indistinguishable from a legitimate 
newcomer. Always defecting against strangers is robust against whitewashers, but it 
discourages newcomers and may also initiate unfavorable cycles of defection. It has 
been shown that cooperating with strangers with a fixed probability 0 < p < 1 is not 
robust against whitewashers. On the other hand, adapting the probability of cooperation 
with strangers to the frequency of past cooperation by strangers appears to be effective 
against whitewashers, at least for a sufficiently small turnover rate. 
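
The adaptive policy toward strangers can be illustrated with a short sketch. The class name `StrangerPolicy`, the exponential moving average, and the smoothing weight are assumptions made here for illustration; the text does not specify how the frequency of past cooperation is estimated.

```python
import random


class StrangerPolicy:
    """Cooperate with strangers at a rate that tracks how strangers behaved before."""

    def __init__(self, initial_estimate=1.0, weight=0.1, rng=None):
        # Running estimate of how often strangers cooperated (hypothetical EMA).
        self.estimate = initial_estimate
        self.weight = weight
        self.rng = rng or random.Random()

    def cooperate_with_stranger(self):
        # Cooperate with probability equal to the current estimate.
        return self.rng.random() < self.estimate

    def observe_stranger(self, stranger_cooperated):
        # Move the estimate toward 1 after cooperation and toward 0 after defection.
        target = 1.0 if stranger_cooperated else 0.0
        self.estimate += self.weight * (target - self.estimate)


policy = StrangerPolicy(rng=random.Random(0))
for _ in range(50):  # a run of whitewashers, who always defect
    policy.observe_stranger(False)
```

After a long run of defecting strangers the estimate is close to zero, so whitewashers gain little, while a population of cooperative newcomers would push it back up.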

In the next three sections, we will discuss incentive mechanisms for p2p systems 
based on reputation, barter, and currency. 


23.3 Reputation 


Reputation has an excellent track record at facilitating cooperation in very diverse 
settings, from evolutionary biology to online marketplaces like eBay. It is therefore 
unsurprising that many p2p systems have adopted some form of reputation scheme to 
reward good behavior and/or punish bad behavior by the peers. 

In general, a p2p reputation scheme is coupled with a service differentiation scheme. 
Contributing peers possess good reputations and receive good service from other peers, 
while noncontributing peers possess bad reputations and receive poor service from others. 
For example, peers in the KaZaA file-sharing network build up their reputation 
scores by uploading files to others, and are rewarded with higher priority when down- 
loading files from others. Similar schemes have been proposed for p2p storage, p2p 
multicast, and mobile ad hoc networks. 

Used in conjunction with other security techniques, a p2p reputation scheme can 
also be used to identify, isolate, and avoid malicious peers in a system. For example, 
the Eigentrust algorithm computes global trust values of peers by aggregating local 
trust values based on the notion of transitive trust, similar to the PageRank algorithm. 
Peers that introduce inauthentic files into the system receive a low global trust value 
and will be shunned by others. The Credence system extends the notion of reputation 
from peers to objects. Reputation scores are maintained for individual objects in the 
p2p system. These techniques can be used to defend against pollution and poisoning 
attacks in p2p file-sharing networks. 

Reputation systems may be subject to a number of different attacks. Multiple col- 
luding peers may boost one another’s reputation scores by giving false praise, or punish 
a target peer by giving false accusations. The availability of cheap pseudonyms in p2p 
systems makes reputation systems vulnerable to Sybil attacks and whitewashing attacks. 
In a Sybil attack, a single malicious peer generates multiple identities that collude with 
one another. In a whitewashing attack, a peer defects in every p2p transaction, but 
repeatedly leaves and rejoins the p2p system using newly created identities, so that it 
will never suffer the negative consequences of a bad reputation. 

A comprehensive treatment of the design and implementation of reputation systems 
is provided in a separate chapter of this book. We will therefore focus our attention on the 
use of reputation and service differentiation schemes in establishing cooperation in 
p2p systems. In particular, we will construct a minimalistic model of a p2p system 
(in Section 23.3.1) to explore its dynamics and resulting equilibria in the absence 
of any reputation scheme, and see (in Section 23.3.2) how a reputation and service 
differentiation scheme can improve the performance of the system. 


23.3.1 A Minimalist p2p Model 


Consider a population of rational peers with heterogeneous willingness to contribute 
resources to the system. Each peer $i$ has a type $\theta_i$, reflecting his generosity or the 
maximum cost he is willing to incur in contribution. Each peer makes autonomous 
decisions whether to contribute or free-ride based on the relationship between the 
cost of contribution and his type. Since contributors have to carry the load of the 
system, the contribution cost can be modeled as inversely proportional to the fraction of 
contributors in the system. Thus, if at present a fraction $x$ of the peers are contributing, 
the contribution cost is $1/x$, and therefore the decision of a rational peer with type 
$\theta_i$ is: 

    Contribute, if $\theta_i \ge 1/x$; 
    Free-ride, otherwise. 


Even within this simple framework we can already see some interesting implications. 
In this “free market” environment where no incentive mechanism is in place, 
the contribution level $x$ in equilibrium is determined as the intersection of the type 
distribution, $x = \Pr(\theta_i \ge \theta)$, with the curve $x = 1/\theta$. 

Figure 23.2. (a) The intersection points of the type distribution and cost curves represent 
two equilibria of the system. The curve $1/\theta$ represents the contribution cost, and $\Pr(\theta_i \ge \theta)$ 
represents the generosity CDF, assuming $\theta_i \sim U(0, \theta_m)$. The higher equilibrium (contribution 
level $x_1$) is stable. The point $x = 0$ is an additional equilibrium of the system. (b) Under 
the service differentiation mechanism, the cost curve shifts from $1/\theta$ to $\frac{x + (1-x)(1-p)}{x} - p\alpha x$. 
Consequently, the attractor ($x_1$) shifts upward. 

Figure 23.2 shows the equilibria when the generosity type is uniformly distributed 
between 0 and some maximal value $\theta_m$. There are three equilibria in this system. 
The first two are the intersection points of the type distribution curve and the cost 
curve. The third equilibrium is $x = 0$, which always exists. Consider the natural fixpoint 
dynamics of the system, i.e., starting at some initial $x$, peers arrive at individual 
decisions, their aggregate decisions define a new $x$, which leads to a new aggregate 
decision, and so on. When the system is out of equilibrium, the direction in 
which the system moves depends on the relative heights of the two curves. If the 
cost curve is above the type distribution curve, contribution cost is higher than the 
fraction of users who are willing to contribute at this cost, so the fraction of contributors 
decreases. For example, in Figure 23.2, this happens for $x < x_2$ or $x > x_1$. 
Conversely, for $x_2 < x < x_1$, the contribution cost is lower than the willingness to 
contribute, so the contribution level increases. Therefore, $x = x_1$ and $x = 0$ are the two 
attractors of the fixpoint dynamics. As long as the initial $x$ lies above the lower 
intersection point ($x_2$), the process converges to the upper one ($x_1$). Otherwise, if the 
initial $x$ is below the lower intersection point, or if there is no intersection (i.e., when 
there are too many selfish rascals around), then $x$ converges to 0 and the system 
collapses. 

The contribution level of the system, $x$, is derived by solving the fixpoint equation 
$x = \mathrm{Prob}(\theta_i \ge 1/x)$. If we consider the case in which the generosity of the peers 
is uniformly distributed between 0 and $\theta_m$, i.e., $\theta_i \sim U(0, \theta_m)$, then 
$\mathrm{Prob}(\theta_i \ge 1/x) = 1 - \frac{1}{\theta_m x}$, and the fixpoint equation is $x = 1 - \frac{1}{\theta_m x}$. The solutions are 

$$x_{1,2} = \frac{\theta_m \pm \sqrt{\theta_m^2 - 4\theta_m}}{2\theta_m}.$$ 

The larger root $x_1$ is a stable equilibrium while $x_2$ is not. $\theta_m$ denotes the maximal 
willingness to contribute resources, and reflects the overall generosity level of the 
system. 
Claim 23.1 The stable nonzero equilibrium contribution level ($x_1$) increases in 
$\theta_m$ and converges to 1 as $\theta_m$ goes to $\infty$, but falls to zero when $\theta_m < 4$. 
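
The equilibria and the fixpoint dynamics of this model can be explored numerically. The following is a minimal sketch for the uniform case $\theta_i \sim U(0, \theta_m)$; the function names `equilibria` and `iterate` and the number of rounds are illustrative choices, not part of the model.

```python
import math


def equilibria(theta_m):
    """Nonzero equilibria (x1, x2) of x = 1 - 1/(theta_m * x), or None if none exist."""
    disc = theta_m * theta_m - 4.0 * theta_m
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return (theta_m + root) / (2 * theta_m), (theta_m - root) / (2 * theta_m)


def iterate(theta_m, x0, rounds=200):
    """Fixpoint dynamics: in each round, peers with theta_i >= 1/x contribute."""
    x = x0
    for _ in range(rounds):
        if x <= 0:
            return 0.0
        # Prob(theta_i >= 1/x) under U(0, theta_m), clipped at zero.
        x = max(0.0, 1.0 - 1.0 / (theta_m * x))
    return x


x1, x2 = equilibria(5.0)          # two nonzero equilibria exist since theta_m >= 4
stable = iterate(5.0, x0=0.5)     # starts above x2, so it approaches x1
collapsed = iterate(5.0, x0=0.2)  # starts below x2, so it collapses to 0
```

With $\theta_m = 3$ the discriminant is negative, `equilibria` returns `None`, and the dynamics collapse from any starting point, in line with Claim 23.1.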


So far we have been interested only in costs. To understand system performance, 
we need to consider system benefits as well. We assume that the benefit a peer receives 
from participation in the system (whether or not she contributes) is proportional to the 
contribution level in the system, and thus a function of the form $\alpha x$ for some constant 
$\alpha > 1$. We concentrate on cases where $\alpha$ is large, in which $x = 0$ is socially inefficient. 

We define the performance of the system, $W_S$, as the difference between the total 
benefits received by all peers and the total contribution cost incurred by all peers (noting 
that free riders incur no costs). Normalizing network size to 1, for $x > 0$ we have 


$$W_S = \alpha x - (1/x)x = \alpha x - 1.$$ 


According to the definition of system performance and Claim 23.1, even if participation 
can provide high benefits to the peers, the system will still collapse if the maximal 
generosity is low, since the system performance is limited by the low contribution 
level. In the next section, we see how a reputation and service differentiation scheme 
can overcome this problem. 


23.3.2 Reputation and Service Differentiation 


Now let us introduce an incentive mechanism based upon reputation and service dif- 
ferentiation. Consider a reputation system that can catch free riders with probability p, 
and a service differentiation policy where identified free riders are excluded from the 
system. An alternate interpretation is a reputation system that can perfectly distinguish 
free riders and contributors, used in conjunction with a service differentiation policy 
where free riders are penalized with a reduced level of service of 1 — p times that of a 
contributor. 

Degrading the performance of the free riders has two effects, both of which lead 
to a higher contribution level. First, since free riders get only a fraction $1 - p$ of the 
benefits, the load placed on the system decreases to $x + (1 - x)(1 - p)$. Therefore, 
contribution cost becomes $\frac{x + (1-x)(1-p)}{x}$. Second, the penalty introduces a threat, since 
peers who free ride know that they will receive reduced service or face the possibility 
of expulsion. 

Let $Q$, $R$, and $T$ denote the individual benefit, reduced contribution cost, and threat, 
respectively. A contributor would realize a performance of $Q - R = \alpha x - \frac{x + (1-x)(1-p)}{x}$, 
while a free rider would realize a performance of $Q - T = \alpha x - p\alpha x$. Then, the 
new equilibrium contribution level becomes $x = \mathrm{Prob}(\theta_i \ge R - T)$, and is derived by 
solving the fixpoint equation: $x = \mathrm{Prob}\left(\theta_i \ge \frac{x + (1-x)(1-p)}{x} - p\alpha x\right)$. 

With the reputation and service differentiation mechanism in place, the system 
performance now becomes 


$$W_S(p) = x(Q - R) + (1 - x)(Q - T) = (\alpha x - 1)(x + (1 - x)(1 - p)).$$ 


Imposing a penalty on free riders, while increasing the contribution level, entails 
some social loss. The p2p system designer could set the value of p to achieve a target 
cooperation level. Note that if the penalty is set sufficiently high, the threat T will 
exceed the contribution cost R, and peers will no longer have any reason to free ride. 
In this case, no penalty is actually imposed. With no free riders an optimal system 
performance of a — 1 will be achieved. 


Claim 23.2 Under the penalty mechanism, if $p \ge 1/\alpha$, then there exists an 
equilibrium in which $x = 1$. 


This means that if the benefits of participating in the p2p system are high ($\alpha$ is large), 
either a service differentiation policy that imposes a small performance penalty on free 
riders or a mechanism that can catch and exclude free riders with a small probability 
is sufficient to induce a high level of cooperation (with any maximal generosity level). 
Otherwise, a more severe penalty or a finer sieve for catching free riders would be 
necessary. 
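
The effect of the penalty can be checked numerically by iterating the fixpoint equation $x = \mathrm{Prob}(\theta_i \ge R - T)$. The sketch below assumes $\theta_i \sim U(0, \theta_m)$ as in Section 23.3.1; the function names, the starting point, and the parameter values are illustrative assumptions.

```python
def contribution_level(theta_m, alpha, p, x0=0.5, rounds=2000):
    """Iterate x = Prob(theta_i >= R - T) for theta_i ~ U(0, theta_m)."""
    x = x0
    for _ in range(rounds):
        if x <= 0:
            return 0.0
        reduced_cost = (x + (1 - x) * (1 - p)) / x  # R
        threat = p * alpha * x                      # T
        cutoff = reduced_cost - threat              # contribute iff theta_i >= cutoff
        x = min(1.0, max(0.0, 1.0 - cutoff / theta_m))
    return x


def performance(x, alpha, p):
    """W_S(p) = (alpha * x - 1) * (x + (1 - x) * (1 - p)), for x > 0."""
    return (alpha * x - 1) * (x + (1 - x) * (1 - p))


# A population that collapses without incentives (theta_m = 3 < 4) ...
without_penalty = contribution_level(theta_m=3.0, alpha=10.0, p=0.0)
# ... reaches full contribution once p >= 1/alpha.
with_penalty = contribution_level(theta_m=3.0, alpha=10.0, p=0.2)
```

In this example the penalty is never actually imposed at the resulting equilibrium, so `performance(with_penalty, 10.0, 0.2)` equals $\alpha - 1$.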


23.4 A Barter-Based System: BitTorrent 


BitTorrent is a popular p2p file-sharing system with incentives as an integral part of its 
design. It departs from earlier p2p file-sharing systems in that its incentive mechanism 
is based loosely on direct reciprocity rather than indirect reciprocity. 

In BitTorrent, a seeding peer divides a large file into small fixed-size pieces, and 
provides different pieces to different peers, who in turn exchange pieces with one 
another. A peer can reconstruct the file once it has obtained all the pieces. This technique 
is known as swarming download or parallel download. To induce peers to upload 
their pieces, a peer’s download rate is influenced by his upload rate through a direct 
reciprocity or barter scheme. 

BitTorrent attempts to alleviate the problem of random matching in large populations 
(Figure 23.1(b) in Section 23.2) by enforcing repeated transactions among peers. When 
a peer initiates a file download, it is matched with a small set of around 40 peers who 
are also downloading or uploading pieces of the same file. The peer selects four or 
five peers out of the set to connect to as neighbors, and periodically updates the 
list of neighbors with those peers that provide the best download rates. Through an 
optimistic unchoking mechanism, a peer occasionally selects a random peer from 
the set to upload to, with the hope of finding new peers that can provide better download 
rates than the current neighbors. 
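
The neighbor selection described above can be sketched as follows. This is not the BitTorrent client code; the function name `choose_uploads`, the rate table, and the choice to draw the random peer only from peers that are not already neighbors are simplifying assumptions.

```python
import random


def choose_uploads(download_rates, slots=4, rng=None):
    """Pick the best `slots` peers by observed rate, plus one random other peer.

    download_rates maps a peer id to the rate at which that peer has been
    sending us data (hypothetical units).
    """
    rng = rng or random.Random()
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    regular = ranked[:slots]  # reciprocate with the best providers
    others = ranked[slots:]
    # Occasionally probe one random peer in the hope of finding a better neighbor.
    probe = [rng.choice(others)] if others else []
    return regular, probe


rates = {f"peer{i}": float(i) for i in range(40)}  # a set of around 40 peers
regular, probe = choose_uploads(rates, rng=random.Random(1))
```

Re-running the selection periodically with fresh rate measurements is what confines a peer's interactions to a small, slowly changing group.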

With this design, BitTorrent peers engage in multiple interactions with a small 
number of peers for the duration of a file download period. For the exchange of large 
files such as movies and software binaries, the number of repeated interactions can be 
quite large, allowing cooperation to take hold through direct reciprocity. However, the 
BitTorrent barter scheme does not address cooperation beyond the file download period. 
As a result, peers have no incentive to serve as a seeder, i.e., to continue uploading after 
their own download is complete. To overcome this problem, a number of BitTorrent 
communities employ some form of reputation scheme on top of the existing barter 
scheme, and exclude peers with low contribution levels. 

BitTorrent represents the state of the art in p2p file-sharing, and appears to be able 
to establish cooperative communities in practice. However, several theoretical and experimental 
studies have revealed flaws associated with its incentive scheme. Through 
the formalization of specification faithfulness, Shneidman et al. (2004) demonstrate 
that the BitTorrent protocol is vulnerable to a number of rational manipulations by a 
selfish peer, including (1) pretending to have a lower upload bandwidth while retaining 
relative order with respect to the upload rate of other peers, so as to reduce its upload 
rate without compromising its download rate; (2) pretending to be split into multiple 
nodes (Sybil attack) to increase its chance of being randomly selected for down- 
load; (3) replacing identities when it is beneficial to do so (whitewashing attack); and 
(4) uploading garbage data to boost its upload rate. Therefore, it remains an open 
question if and how BitTorrent (or any other p2p barter scheme) can be made robust 
against all forms of rational manipulations. 

The Fair, Optimal eXchange (FOX) protocol offers a different, theoretical approach 
to solving the free-riding problem in p2p file swarming systems. Assuming that all 
peers are homogeneous with a capacity to serve k requests in parallel, and that each seeks to 
minimize its download completion time, FOX runs a distributed, synchronized protocol 
based on a static structured k-ary tree to schedule the exchange of file blocks between 
peers. Optimal download completion times can be achieved by all peers if all peers 
comply with the protocol. 

FOX employs a “grim trigger” strategy to enforce compliance. When a peer finds 
out that its neighbor deviates from the protocol, it can trigger a “meltdown” of the 
entire system. This threat results in an equilibrium where all rational nodes execute the 
protocol as specified, since any deviation will lead to an infinite download completion 
time. However, the equilibrium is not a subgame perfect equilibrium, and the threat 
is not credible. The protocol has limited practicality since the system is vulnerable to 
meltdown caused by a single malicious or faulty node. 


23.5 Currency 


A p2p system can also employ a currency scheme to facilitate resource contributions 
by rational peers. Generally, peers would earn currency by contributing resources to 
the system, and spend the currency to obtain resources from the system. MojoNation 
and Karma are two examples of currency-based p2p systems. 

Golle et al. (2001) provide the first equilibrium analysis of a p2p payment system. 
In the model, each peer makes an independent decision regarding his download and 
upload amounts. If each peer is charged an amount proportional to the gap between his 
downloads and uploads, then a unique strict Nash equilibrium exists where all peers 
would maximize their upload and download amounts. 

A more recent work by Friedman et al. (2006) looks at the efficiency of a currency- 
based p2p system. First, it establishes the existence, for each fixed amount of money 
supply in the system, of a nontrivial Nash equilibrium where all peers play a threshold 
strategy, given a large enough discount rate. When playing a threshold strategy, a peer 
will satisfy a request (and earn some money) if his current balance is less than some 
threshold value, and refuse to satisfy a request if his current balance is above the 
threshold. By comparing the efficiency of equilibria at different money supply levels, it 
is possible to determine the money supply level that maximizes efficiency for a system 
of a given size. It is interesting to note that the effective money supply level can be 
controlled either via the explicit injection or removal of currency or via changing the 
price of servicing a request. This means that inflation can be used as a tool to maintain 
the efficiency of the system as it grows in size. 
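
A threshold strategy is easy to simulate. The sketch below is a toy illustration and not the model analyzed by Friedman et al. (2006): the random matching, the fixed price, the equal initial balances, and the function name `simulate` are all assumptions made here.

```python
import random


def simulate(n_peers=20, money_per_peer=2, threshold=4, price=1, rounds=5000, seed=0):
    """Return (fraction of requests served, final balances) under a threshold strategy."""
    rng = random.Random(seed)
    balance = [money_per_peer] * n_peers
    served = requests = 0
    for _ in range(rounds):
        client, server = rng.sample(range(n_peers), 2)
        if balance[client] < price:
            continue  # the client cannot pay, so no request is made
        requests += 1
        if balance[server] < threshold:  # serve only while below the threshold
            balance[server] += price
            balance[client] -= price
            served += 1
    fraction = served / requests if requests else 0.0
    return fraction, balance


fraction_served, balances = simulate()
```

Varying `money_per_peer` changes the money supply; with too little money many clients cannot pay, and with too much many servers sit at the threshold and refuse, which is the trade-off behind an efficiency-maximizing money supply.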

Robustness against Sybil and whitewashing attacks is still an important requirement 
for currency-based p2p system design in general. For example, a currency system can 
still be vulnerable to the whitewashing attack if newcomers are endowed with a positive 
opening balance, or if the balance is allowed to become negative, even temporarily. 


23.6 Hidden Actions in p2p Systems 


As we mentioned in the Introduction, strategic behavior in p2p systems goes far beyond 
free-riding in file-sharing networks. Peers may make strategic decisions on the timing 
of their arrivals and departures from the network, in selecting which peers to connect 
to, on whether to truthfully report to the system private information such as costs and 
valuations, or engage in other ways of manipulating the system protocol or mechanism. 
In this section, we will consider the issue of hidden action in p2p systems — how peers 
may behave strategically when their actions are hidden from the rest of the network, 
and how currency-based incentive mechanisms could be devised to overcome this 
problem. 

Consider the case of p2p file-sharing. In addition to sharing files, the peers in file- 
sharing networks such as Gnutella and KaZaA are also expected to forward protocol 
messages to and from their neighbors. For example, when a peer receives a query 
message from one of its neighbors, it is expected to forward the message to its other 
neighbors, in addition to responding to the query if it is able to. However, the peer could 
strategically choose to drop the message or forward the message probabilistically, so 
as to reduce its message forwarding costs. In many systems, such an action is not 
easily observable, nor can a defecting node be readily identified, since messages are 
forwarded on a best-effort basis and the topology is continually changing as peers 
enter and leave the network. Clearly, such a system would cease to function if all peers 
strategically decide not to forward any messages. How can the querying node provide 
incentives for the other nodes to perform the message forwarding task? 

The problem of hidden action in message forwarding can be readily generalized to 
other peer-to-peer settings. For example, devices in mobile ad hoc networks (MANETs) 
strategically drop packets to conserve their constrained energy resources. Internet 
Service Providers (ISPs) commonly practise hot potato routing to avoid the cost of 
transporting packets over their own networks. Indeed, the problem of hidden action is 
hardly unique to networks, and has long been studied by economists as the problem 
of moral hazard in contexts ranging from insurance to labor contracts. In the next 
section, we will apply the principal-agent framework to analyze the efficiency loss due 
to hidden action, and the design of optimal contracts to induce effort by the agents. 


23.6.1 The Principal-Agent Model 


A principal employs a set of $n$ agents, $N$. Each agent $i \in N$ has a set of possible actions 
$A_i = \{0, 1\}$, and a cost (effort) $c(a_i) \ge 0$ for each possible action $a_i \in A_i$. The cost of 
low effort is zero while the cost of high effort is $c > 0$, i.e., $c(0) = 0$ and $c(1) = c$. 
The actions of the agents collectively and probabilistically determine a “contractible” 
outcome, $o \in \{0, 1\}$, where the outcomes 0 and 1 denote project failure and success, 
respectively. The principal’s valuation of a successful project is given by a scalar $v > 0$, 
while he gains no value from a project failure. The outcome is determined according 
to the project technology, or a success function $t : A_1 \times \cdots \times A_n \to [0, 1]$, where 
$t(a_1, \ldots, a_n)$ denotes the probability of project success when agents adopt the action 
profile $a = (a_1, \ldots, a_n) \in A_1 \times \cdots \times A_n = A$. 

We identify a subclass of technologies that can be represented by read-once networks. 
Read-once networks are given by a graph with two special nodes, a source and 
a sink, and each agent $i$ controls a single edge. If an agent exerts low effort, he succeeds 
with probability $\gamma_i$, and if he exerts high effort, the success probability increases to 
$\delta_i > \gamma_i$. The project succeeds if there is a successful source-sink path, where the technology 
maps the individual successes and failures of agents (denoted by $x_i = 1$ and 
$x_i = 0$ respectively) into the probability of project success. Two natural examples are 
the “AND” and the “OR” technologies. We consider the case in which the technology is 
anonymous (symmetric in the agents) and is further determined by a single parameter 
$\gamma \in (0, 1/2)$ that satisfies $1 - \delta_i = \gamma_i = \gamma$ for all $i$. 


The “AND” technology $f(x_1, \ldots, x_n)$ is the logical conjunction of $x_i$ ($f(x) = 
\bigwedge_{i \in N} x_i$). Thus the project succeeds if and only if all agents succeed in their tasks 
(shown graphically in Figure 23.3(a)). If $m$ agents exert effort ($\sum_i a_i = m$), then 
$t(a) = \gamma^{n-m}(1 - \gamma)^m$. 

For example, packet forwarding in a mobile ad hoc network can be represented 
by the AND technology. Each edge on the path is controlled by a single agent who 
succeeds in forwarding the packet with probability $\gamma \in (0, \frac{1}{2})$ if he exerts low effort 
($a_i = 0$), and with probability $1 - \gamma \in (\frac{1}{2}, 1)$ if he exerts high effort ($a_i = 1$). The 
message is delivered to the final destination if and only if all the individual agents 
have succeeded in their single-hop deliveries. The sender can only observe whether the 
message has reached the destination. 


The “OR” technology $f(x_1, \ldots, x_n)$ is the logical disjunction of $x_i$ ($f(x) = \bigvee_{i \in N} x_i$). 
Thus the project succeeds if and only if at least one of the agents succeeds in his task 
(shown graphically in Figure 23.3(b)). If $m$ agents exert effort ($\sum_i a_i = m$), then 
$t(a) = 1 - \gamma^m (1 - \gamma)^{n-m}$. 
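
The two closed forms can be checked against a brute-force enumeration of the individual outcomes $x_i$. This is a small sketch; the function names are illustrative and the anonymous case ($\gamma_i = \gamma$, $\delta_i = 1 - \gamma$) is assumed.

```python
from itertools import product


def t_and(actions, gamma):
    """Closed form for the anonymous AND technology."""
    n, m = len(actions), sum(actions)
    return gamma ** (n - m) * (1 - gamma) ** m


def t_or(actions, gamma):
    """Closed form for the anonymous OR technology."""
    n, m = len(actions), sum(actions)
    return 1 - gamma ** m * (1 - gamma) ** (n - m)


def success_probability(actions, gamma, f):
    """Sum the probability of every outcome vector x on which f(x) is true."""
    total = 0.0
    for x in product((0, 1), repeat=len(actions)):
        prob = 1.0
        for a_i, x_i in zip(actions, x):
            p_success = 1 - gamma if a_i == 1 else gamma
            prob *= p_success if x_i == 1 else 1 - p_success
        if f(x):
            total += prob
    return total


profile = (1, 1, 0)  # two of three agents exert effort
and_check = success_probability(profile, 0.25, all)  # f = logical conjunction
or_check = success_probability(profile, 0.25, any)   # f = logical disjunction
```

Here `and_check` agrees with `t_and(profile, 0.25)` and `or_check` with `t_or(profile, 0.25)`, up to floating-point error.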

For example, the practice of multipath routing (Ganesan et al., 2001; Xu and Rexford, 
2006), where a message is duplicated and sent over multiple paths to a single destina- 
tion, can be represented by the OR technology if each path is represented by a single 
agent.¹ Each agent succeeds in forwarding the message with probability $\gamma \in (0, \frac{1}{2})$ 
if he exerts low effort ($a_i = 0$), and with probability $1 - \gamma \in (\frac{1}{2}, 1)$ if he exerts high 
effort (a; = 1). The project is considered a success if at least one of the messages is 
successfully delivered to the destination. 


1 Query message forwarding in p2p file-sharing networks may be modeled by OR-of-AND technology since the 
messages may be forwarded multiple hops along multiple paths. 
Figure 23.3. Graphical representations of (a) AND and (b) OR technologies. The project 
succeeds if there is a successful path from $s$ to $t$. Each agent controls an edge and succeeds 
with probability $\gamma$ with no effort, and with probability $1 - \gamma$ with effort. 


The principal may design enforceable contracts based on the observable outcome.² 
We impose the limited liability constraint, so negative payments to the agents (or 
fines paid by agents to the principal) are disallowed. A contract is thus a commitment 
to pay agent $i$ an amount $p_i \ge 0$ upon project success, and nothing upon project failure. 

Given this setting, the agents have been placed in a game, where the utility of agent 
$i$ under the profile of actions $a = (a_1, \ldots, a_n)$ is given by $u_i(a) = p_i \cdot t(a) - c(a_i)$. 
Following convention, we denote by $a_{-i} \in A_{-i}$ the vector of the actions of all agents 
excluding agent $i$, i.e., $a_{-i} = (a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n)$. The principal’s problem 
is that of designing the contracts $p_i$ for each agent $i$, so as to maximize his own 
expected utility $u(a, v) = t(a) \cdot (v - \sum_{i \in N} p_i)$, where the actions $a_1, \ldots, a_n$ are at 
Nash equilibrium. In the case of multiple Nash equilibria, the principal can choose a 
desired one and “suggest” it to the agents. While this is a standard assumption, in our 
setting it is further justified by the fact that the best Nash equilibrium is also a strong 
equilibrium (i.e., equilibrium in which no subgroup of agents can coordinate a joint 
deviation such that every member of the subgroup strictly improves his utility), and the 
unique strong equilibrium in many scenarios. 

As we wish to concentrate on motivating agents, rather than on the coordination 
between agents, we assume that more effort by an agent always leads to a higher 
probability of success. Formally, 


$$\forall i \in N,\ \forall a_{-i} \in A_{-i}: \quad t(1, a_{-i}) > t(0, a_{-i})$$ 


In addition, we assume that $t(a) > 0$ for any $a \in A$. 


Definition 23.3. The marginal contribution of agent $i$, given $a_{-i} \in A_{-i}$, is 
$\Delta_i(a_{-i}) = t(1, a_{-i}) - t(0, a_{-i})$. 


$\Delta_i(a_{-i})$ is the increase in success probability due to agent $i$ moving from no effort 
to effort, given the effort of the others. The best strategy of agent $i$ can be easily 
determined as a function of the other agents’ effort levels, $a_{-i} \in A_{-i}$, and his 
contract $p_i$. 


2 An alternate approach is to maintain a trusted clearinghouse to whom agents report intermediate outcomes, and 
the challenge is to induce the agents to report truthfully (Zhong et al., 2003). 
Claim 23.4 Given a profile of actions $a_{-i}$, agent $i$’s best strategy is $a_i = 1$ if 
$p_i \ge \frac{c}{\Delta_i(a_{-i})}$, and is $a_i = 0$ if $p_i \le \frac{c}{\Delta_i(a_{-i})}$. (In the case of equality the agent is 
indifferent between the two alternatives.) 

As $p_i \ge \frac{c}{\Delta_i(a_{-i})}$ if and only if $u_i(1, a_{-i}) = p_i \cdot t(1, a_{-i}) - c \ge p_i \cdot t(0, a_{-i}) = 
u_i(0, a_{-i})$, agent $i$’s best strategy in this case is to choose $a_i = 1$. This allows us 
to specify the principal’s optimal contracts for inducing a given equilibrium. 
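
The best-response rule of Claim 23.4 can be written directly in terms of the marginal contribution. The sketch below uses the anonymous AND technology as an example; the function names and the parameter values are illustrative assumptions.

```python
def t_and(actions, gamma=0.25):
    """Anonymous AND technology used as the example success function."""
    m = sum(actions)
    return gamma ** (len(actions) - m) * (1 - gamma) ** m


def marginal_contribution(t, i, others):
    """Delta_i(a_-i) = t(1, a_-i) - t(0, a_-i); `others` is the profile without agent i."""
    with_effort = others[:i] + (1,) + others[i:]
    without_effort = others[:i] + (0,) + others[i:]
    return t(with_effort) - t(without_effort)


def best_response(t, i, others, payment, cost):
    """Return 1 if effort is a best response for agent i, else 0 (ties go to effort)."""
    return 1 if payment >= cost / marginal_contribution(t, i, others) else 0


# Two agents; the other agent exerts effort, so Delta_0 = 0.5625 - 0.1875 = 0.375.
threshold_payment = 1.0 / marginal_contribution(t_and, 0, (1,))
effort_if_paid_3 = best_response(t_and, 0, (1,), payment=3.0, cost=1.0)
effort_if_paid_2 = best_response(t_and, 0, (1,), payment=2.0, cost=1.0)
```

With a cost of 1 the threshold payment is $1/0.375 \approx 2.67$, so a payment of 3 induces effort and a payment of 2 does not.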


Claim 23.5 The best contracts for the principal that induce $a \in A$ as an equilibrium 
are $p_i = 0$ for agent $i$ who exerts no effort ($a_i = 0$), and $p_i = \frac{c}{\Delta_i(a_{-i})}$ for 
agent $i$ who exerts effort ($a_i = 1$). 

In this case, the expected utility of agent $i$ who exerts effort is $c \cdot \left(\frac{t(1, a_{-i})}{\Delta_i(a_{-i})} - 1\right)$, 
and 0 for an agent who shirks. The principal’s expected utility is given by 


$$u(a, v) = \left(v - \sum_{i \mid a_i = 1} \frac{c}{\Delta_i(a_{-i})}\right) \cdot t(a).$$ 
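
The contracts of Claim 23.5 and the resulting utilities can be computed for any success function. This is a minimal sketch with the anonymous AND technology as the example; the function names and numbers are illustrative assumptions.

```python
def t_and(actions, gamma=0.25):
    """Anonymous AND technology used as the example success function."""
    m = sum(actions)
    return gamma ** (len(actions) - m) * (1 - gamma) ** m


def optimal_contracts(t, actions, cost):
    """Payments p_i = c / Delta_i(a_-i) for agents with a_i = 1, and 0 otherwise."""
    payments = []
    for i, a_i in enumerate(actions):
        if a_i == 0:
            payments.append(0.0)
            continue
        shirk = actions[:i] + (0,) + actions[i + 1:]
        payments.append(cost / (t(actions) - t(shirk)))
    return payments


def principal_utility(t, actions, cost, value):
    """u(a, v) = t(a) * (v - sum of the payments promised on success)."""
    return t(actions) * (value - sum(optimal_contracts(t, actions, cost)))


def agent_utility(t, actions, cost, i):
    """Expected utility of agent i under the contracts above."""
    return optimal_contracts(t, actions, cost)[i] * t(actions) - cost * actions[i]


payments = optimal_contracts(t_and, (1, 1), cost=1.0)
u_principal = principal_utility(t_and, (1, 1), cost=1.0, value=6.0)
u_agent = agent_utility(t_and, (1, 1), cost=1.0, i=0)
```

In this example each agent is promised $8/3$ on success and keeps a positive expected utility of $0.5$, which is the rent the principal gives up relative to the observable-actions case.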


If $a_i = 1$ in the induced equilibrium $a$, we say that the principal contracts 
with agent $i$. Note that the utility of the principal is lower than in the 
observable-actions case, as the payment to each agent is higher than the agent’s cost. In 
economic terms, the principal can only obtain the “second best” but not the “first best” 
solution under hidden actions.³ 

The principal’s goal is to determine the profile of actions $a^* \in A$, which gives the 
highest utility $u(a, v)$ in equilibrium, given his valuation $v$. Choosing $a \in A$ corresponds 
to choosing a set $S$ of agents that exert effort ($S = \{i \mid a_i = 1\}$). The set of 
agents $S^*(v)$ that the principal contracts with in $a^*$ ($S^*(v) = \{i \mid a^*_i = 1\}$) is an optimal 
contract for the principal at value $v$. We will abuse notation and write $t(S)$ instead of 
$t(a)$, when $S$ is exactly the set of agents that exert effort in $a \in A$. 

A natural yardstick by which to measure this decision is the observable-actions 
case. When the principal can observe the individual actions of each agent, he can induce 
effort with a payment $p_i = c$ to each agent $i$. In this case the principal’s utility is 
exactly the social welfare, and so the principal will simply choose the profile $a \in A$ 
that optimizes the social welfare or global efficiency, $t(a) \cdot v - \sum_{i \mid a_i = 1} c$. The worst-case 
ratio between the optimal principal’s utility in this observable-actions case and his 
optimal utility in the hidden-actions case can be termed the price of unaccountability. 

Given a technology $t$, recall that $S^*(v)$ denotes the optimal contract in the hidden-actions 
case and let $S^*_{oa}(v)$ denote an optimal contract in the observable-actions case, 
when the principal’s valuation is $v$. 


Definition 23.6 The price of unaccountability POU($t$) of a technology $t$ is 
defined as the worst ratio (over $v$) between the principal’s utility in the observable-actions 
case and the hidden-actions case: 

$$\mathrm{POU}(t) = \sup_{v > 0} \frac{t(S^*_{oa}(v)) \cdot v - \sum_{i \in S^*_{oa}(v)} c}{t(S^*(v)) \left( v - \sum_{i \in S^*(v)} \frac{c}{t(S^*(v)) - t(S^*(v) \setminus \{i\})} \right)}$$ 


3 In the case of “AND” technology where $\gamma_i = 0$ $\forall i$, it is shown in Feldman et al. (2005) that the principal can 
obtain the first best. While it is shown for the case in which agents take sequential actions, the same qualitative 
results also apply to the case of simultaneous actions (as $\Delta_i(a_{-i}) = t(1, a_{-i})$ the expected utility of each agent 
is 0). It is also shown that the principal achieves the first best either through direct contracts (i.e., the principal 
contracts with each agent directly) or through recursive contracts (i.e., each agent contracts with its subsequent 
agent). 


For example, in the packet forwarding example, the POU measures the worst mul- 
tiplicative loss incurred by the sender due to his inability to monitor the individual 
actions taken by the intermediate nodes. 


23.6.2 Results 


We wish to understand how the optimal set of contracted agents should be selected as 
a function of the principal’s valuation of project success. A basic observation is that 
the optimal contract weakly “improves” with an increase in the valuation v. 


Lemma 23.7 (Monotonicity lemma) For any technology t, in both the
hidden-actions and the observable-actions cases, the expected utility of the
principal at the optimal contracts, the success probability of the optimal
contracts, and the expected payment of the optimal contract are all monotonically
nondecreasing with the valuation v.


For technologies in which the success probability depends only on the number 
of agents that exert effort (e.g., anonymous AND and OR), the above implies that 
the number of contracted agents is a monotonically non-decreasing function of the 
valuation. We find that the AND and OR technologies have very different structures on 
the optimal contracts: AND has just a single transition, from 0 agents to n agents, while 
OR has all transitions. 


Theorem 23.8 For any anonymous AND technology with n agents and with
$\gamma = \gamma_i = 1 - \delta_i \in (0, \tfrac{1}{2})$ for all $i$:

• there exists a valuation⁴ $v_* < \infty$ such that for any $v < v_*$ it is optimal to contract
with no agent, for $v > v_*$ it is optimal to contract with all n agents, and for $v = v_*$,
both contracts (0 and n) are optimal.

• the price of unaccountability is obtained at the transition point of the hidden-actions
case, and is $\mathrm{POU} = \left(\frac{1}{\gamma} - 1\right)^{n-1} + \left(1 - \frac{\gamma}{1 - \gamma}\right)$.

Notice that the POU is not bounded across the AND family of technologies (for
various $n$, $\gamma$), as $\mathrm{POU} \to \infty$ either if $\gamma \to 0$ (for any given $n \ge 2$) or $n \to \infty$ (for
any fixed $\gamma \in (0, \tfrac{1}{2})$).

4 $v_*$ is a function of $n$, $\gamma$, $c$.

This means that in the message forwarding example, the sender will induce either
all or none of the agents to exert effort in forwarding a message. Moreover, the loss
incurred by the sender due to his inability to monitor the individual actions may be
very large. This suggests a possible role for a network monitoring system, even if it is
costly to implement.
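
The closed form in Theorem 23.8 can be checked numerically. The sketch below is only an illustration under assumptions restated from the chapter's setup: in the anonymous AND technology the success probability with k of n agents exerting effort is taken to be $(1-\gamma)^k \gamma^{n-k}$, all agents share one cost c, and under hidden actions each contracted agent is paid $c/(t(k) - t(k-1))$ upon success. The function names are hypothetical. The ratio is evaluated at the hidden-actions transition point $v_*$, where the theorem says the supremum is obtained.

```python
from fractions import Fraction

def success_prob(k, n, gamma):
    """Assumed anonymous AND: k working agents succeed w.p. 1 - gamma each, the rest w.p. gamma."""
    return (1 - gamma) ** k * gamma ** (n - k)

def hidden_utility(k, n, gamma, c, v):
    """Principal's utility when k agents are each paid c / (t(k) - t(k-1)) on success."""
    if k == 0:
        return success_prob(0, n, gamma) * v
    t_k = success_prob(k, n, gamma)
    marginal = t_k - success_prob(k - 1, n, gamma)
    return t_k * (v - k * c / marginal)

def observable_utility(k, n, gamma, c, v):
    """Principal's utility when effort is observable and each working agent is paid c."""
    return success_prob(k, n, gamma) * v - k * c

def pou_at_transition(n, gamma, c=Fraction(1)):
    delta = 1 - gamma
    # Valuation at which contracting with 0 agents and with n agents
    # yield the same hidden-actions utility.
    v_star = n * c * delta / ((delta - gamma) * (delta ** n - gamma ** n))
    best_hidden = max(hidden_utility(k, n, gamma, c, v_star) for k in range(n + 1))
    best_observable = max(observable_utility(k, n, gamma, c, v_star) for k in range(n + 1))
    return best_observable / best_hidden

def pou_closed_form(n, gamma):
    """The expression stated in Theorem 23.8."""
    return (1 / gamma - 1) ** (n - 1) + (1 - gamma / (1 - gamma))

for n, gamma in [(2, Fraction(1, 4)), (3, Fraction(1, 4)), (4, Fraction(1, 3))]:
    assert pou_at_transition(n, gamma) == pou_closed_form(n, gamma)
    print(n, gamma, pou_closed_form(n, gamma))
```

Exact rational arithmetic is used so that the two expressions can be compared for equality; for n = 2 and γ = 1/4 both give 11/3.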
Next we consider the OR technology. 


Theorem 23.9 For any anonymous OR technology with n agents and with
$\gamma = \gamma_i = 1 - \delta_i \in (0, \tfrac{1}{2})$ for all $i$:

• there exist finite positive values $v_1 < v_2 < \cdots < v_n$ such that for any v where
$v_k < v < v_{k+1}$, contracting with exactly k agents is optimal. (For $v < v_1$, no agent
is contracted, for $v > v_n$, all n agents are contracted, and for $v = v_k$, the principal
is indifferent between contracting with k − 1 or k agents.)


• the POU for OR technology with any n, c and $\gamma \in (0, \tfrac{1}{2})$ is upper bounded by 5/2.


This means that in the multipath routing example, the sender may induce any 
number of paths to exert effort in forwarding the message, depending on his valuation 
of successful message delivery. Moreover, the loss incurred by the sender due to his 
inability to monitor individual actions is always bounded by a factor of 5/2. 
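
To see the "all transitions" structure on a small instance, here is a minimal sketch. It assumes (restating the chapter's setup) that in the anonymous OR technology the project fails only if every agent fails, that a working agent fails with probability γ and a shirking agent with probability 1 − γ, and that contracted agents are paid $c/(t(k) - t(k-1))$ on success. The function names and the sample valuations are illustrative.

```python
from fractions import Fraction

def or_success_prob(k, n, gamma):
    """Assumed anonymous OR: success unless all n agents fail."""
    return 1 - gamma ** k * (1 - gamma) ** (n - k)

def optimal_contract_size(n, gamma, c, v):
    """Number of contracted agents maximizing the principal's hidden-actions utility."""
    def utility(k):
        if k == 0:
            return or_success_prob(0, n, gamma) * v
        t_k = or_success_prob(k, n, gamma)
        marginal = t_k - or_success_prob(k - 1, n, gamma)
        return t_k * (v - k * c / marginal)
    return max(range(n + 1), key=utility)

gamma, c = Fraction(1, 4), Fraction(1)
sizes = [optimal_contract_size(2, gamma, c, Fraction(v)) for v in (1, 10, 200)]
print(sizes)  # [0, 1, 2]
```

With two agents the optimal contract passes through 0, 1, and 2 agents as the valuation grows, in contrast to the single jump from 0 to n in the AND case.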

For general read-once networks, it is not sufficient to determine the number of
contracted agents; one must determine the actual set of contracted agents. It turns out that
computing the optimal contract for any read-once network is at least as hard as computing the
success probability $t(E)$ (the network reliability), which is known to be #P-hard
(Provan and Ball, 1983).


Theorem 23.10 The Optimal Contract Problem for Read-Once Networks is 
#P-hard (under Turing reductions). 


PROOF SKETCH We will show that an algorithm for this problem can be used to
solve the network reliability problem. Given an instance of a network reliability
problem $\langle G, \{\zeta_e\}_{e \in E} \rangle$ (where $\zeta_e$ denotes e’s probability of success), we define
an instance of the optimal contract problem as follows: first define a new graph
G′, which is obtained by “And”ing G with a new player x, with $\gamma_x$ very close to 1/2
and $\delta_x = 1 - \gamma_x$. For the other edges, we let $\delta_e = \zeta_e$ and $\gamma_e = \zeta_e/2$. By choosing
$\gamma_x$ close enough to 1/2, we can make sure that player x will enter the optimal
contract only for very large values of v, after all other agents are contracted. The
critical value of v, where player x enters the optimal contract of G′, can be found
using the algorithm that supposedly finds the optimal contract. At this critical
value, the principal is indifferent between the set E and E ∪ {x}. Now, from the
expression for this indifference (in terms of $t(E)$ and $\Delta_i(E)$), the value of $t(E)$ is
derived.


A natural research problem is to characterize families of technologies whose optimal 
contracts can be computed in polynomial time. In addition, while there exist fully
polynomial-time approximation schemes (FPTAS) for various versions of the network
reliability problem (Karger, 1995), it remains an open question how well one can 
approximate the optimal contract problem. 


23.7 Conclusion


The fundamental premise of peer-to-peer systems is that of voluntary contribution of 
resources by individual users. However, there is an inherent tension between individual 
rationality and collective welfare. Therefore, the design of p2p incentives is of both 
theoretical and practical interest. In this chapter, we have reviewed different classes 
of p2p incentive mechanisms based on reputation, barter, and currency. We saw that 
cooperation can be sustained through barter if the p2p system can enforce repeat 
transactions among peers. Otherwise, incentive mechanisms based on reputation or 
currency may be necessary to overcome the free-riding problem. We also discussed the 
problem of hidden actions in p2p systems, and illustrated the use of contracts to induce 
the desired behavior by the peers. 

Many challenges and open problems remain in the design and evaluation of p2p 
incentives, of which we highlight two. First, what is the range of possible rational 
manipulations against a p2p system that are either specific to, or independent of, the 
type of incentive mechanism in use? For example, we have seen that robustness against
Sybil and whitewashing attacks is an important design requirement for reputation-,
barter-, and currency-based incentive mechanisms. Given a design, can we test its
robustness against a comprehensive catalog of rational manipulations? Second, how 
should we relax the rationality assumption in the analysis and design of p2p systems, 
to account for heterogeneous populations of peers that may be perfectly rational, 
bounded rational, altruistic, malicious, and/or faulty? What would be the appropriate 
solution concepts for p2p systems, and for distributed systems more generally? This 
appears to call for cross-fertilization with both behavioral economics and computer 
security. 

The ease of deploying p2p systems has led to their flowering in a short period 
of time. Today, we have a large number of p2p systems of varying scales running 
real applications of great value to real users. This offers us a unique opportunity to 
validate, using empirical data taken from real users, different designs and theories on 
p2p incentives. With hope, this will advance the theory and practice of incentive design 
for both online and offline systems. 
and Feldman and Chuang (2005). The minimalist p2p model in Section 23.3 is due to
Feldman et al. (2006).
(2006a).


Exercises


23.1 Consider the p2p model in Section 23.3.1. The generosity of the peers is now
distributed as follows: a fraction $\phi$ of the peers have their type $\theta_i$ uniformly distributed
between 0 and $\theta_m$, a fraction $(1 - \phi)/2$ are of type $\theta_i = 0$, and the remaining
$(1 - \phi)/2$ are of type $\theta_i = \theta_m$. How would the resulting equilibrium be different
from that of Claim 23.1?


23.2 In the p2p model of Section 23.3.1, suppose that the system designer has full infor- 
mation on each peer’s type (i.e., generosity level), and could exclude peers based 
on their types (rather than based on their behavior, as suggested in Section 23.3.2). 
Let z denote the fraction of peers who are excluded from the system. Provide an 
explicit expression, as a function of $\theta_m$ and $z$, for the stable equilibrium in the
system under such an exclusion mechanism. Would it always (for any value of $\theta_m$)
be beneficial to exclude some nonzero fraction of the population? Explain. 


23.3 Provide a proof for Theorem 23.8. Hint: First show that at $v_*$ the principal’s utility
when contracting with n agents is greater than that when contracting with $1 \le i < n$
agents. Then, use the monotonicity lemma to show that there must be a single 
transition for any AND technology. Finally, compute the price of unaccountability. 


23.4 Provide a proof for Part 1 of Theorem 23.9, showing that for any OR technology
there are n transitions. Hint: Let $v_{i,i+1}$ ($i \in \{0, \ldots, n-1\}$) be the value of v for which
the principal has the same utility from contracting with i agents and with i + 1
agents. First show that $v_{i,i+1} < v_{i+1,i+2}$ for any $i \in \{0, \ldots, n-2\}$. Then, show that
the above is sufficient to prove the theorem.

23.5 Prove or provide a counterexample to the following claim: For any technology, 
the number of transitions in the hidden-actions case is equal to the number of 
transitions in the observable-actions case. 
23.6 A strategy profile $a \in A$ is a strong equilibrium (SE) if there does not exist any
coalition $\Gamma \subseteq N$ and a strategy profile $a'_\Gamma \in \times_{i \in \Gamma} A_i$ such that for any $i \in \Gamma$,
$u_i(a_{-\Gamma}, a'_\Gamma) > u_i(a)$. Prove that under the optimal payments that induce the optimal
contract S* in Section 23.6.1, S* is a strong equilibrium.


CHAPTER 24 


Cascading Behavior 
in Networks: Algorithmic 
and Economic Issues 


Jon Kleinberg 


Abstract 


The flow of information or influence through a large social network can be thought of as unfolding 
with the dynamics of an epidemic: as individuals become aware of new ideas, technologies, fads, 
rumors, or gossip, they have the potential to pass them on to their friends and colleagues, causing the 
resulting behavior to cascade through the network. 

We consider a collection of probabilistic and game-theoretic models for such phenomena proposed 
in the mathematical social sciences, as well as recent algorithmic work on the problem by computer 
scientists. Building on this, we discuss the implications of cascading behavior in a number of online 
settings, including word-of-mouth effects (also known as “viral marketing”) in the success of new 
products, and the influence of social networks in the growth of online communities. 


24.1 Introduction 


The process by which new ideas and new behaviors spread through a population has 
long been a fundamental question in the social sciences. New religious beliefs or polit- 
ical movements; shifts in society that lead to greater tolerance or greater polarization; 
the adoption of new technological, medical, or agricultural innovations; the sudden 
success of a new product; the rise to prominence of a celebrity or political candidate; 
the emergence of bubbles in financial markets and their subsequent implosion — these 
phenomena all share some important qualitative properties. They tend to begin on a 
small scale with a few “early adopters”; more and more people begin to adopt them 
as they observe their friends, neighbors, or colleagues doing so; and the resulting new 
behaviors may eventually spread through the population contagiously, from person to 
person, with the dynamics of an epidemic. 

People have long been aware of such processes at an anecdotal level; the systematic 
study of them developed, in the middle of the 20th century, into an area of sociology 
known as the diffusion of innovations. The initial research on this topic was empirical 
(see, e.g., Coleman et al., 1966; Rogers, 1995; Strang and Soule, 1998 for background), 
but in the 1970s economists and mathematical sociologists such as Schelling (1978) and
Granovetter (1978) began formulating basic mathematical models for the mechanisms 
by which ideas and behaviors diffuse through a population. There are several reasons 
to seek models that capture observed data on diffusion: in addition to helping us 
understand, at a fundamental level, how the spread of new ideas “works,” such models 
have the potential to help us predict the success or failure of new innovations in their 
early stages, and potentially to shape the underlying process so as to increase (or 
reduce) the chances of success. 

In this chapter, we discuss some of the basic models in this area, as well as suggesting 
some current applications to online information systems. While the overall topic is 
much too vast even to survey in a brief setting such as this, we hope to convey some 
of the game-theoretic and algorithmic grounding of the area, and to highlight some 
directions for future work. We also indicate some of the ways in which large-scale 
online communities provide rich data for observing social diffusion processes as they 
unfold, thus providing the opportunity to develop richer models. Further related work 
is discussed briefly in the Notes at the end of the chapter. 


24.2 A First Model: Networked Coordination Games 


One of the simplest models for social diffusion can be motivated by game-theoretic 
considerations. To set the stage for this, notice that many of the motivating scenarios 
considered above have the following general flavor: each individual v has certain 
friends, acquaintances, or colleagues, and the benefits to v of adopting the new behavior 
increase as more and more of these other individuals adopt it. In such a case, simple 
self-interest will dictate that v should adopt the new behavior once a sufficient fraction 
of v’s neighbors have done so. For example, many new technological, economic, or 
social practices become more valuable as the number of people using them increases: 
two organizations may find it easier to collaborate on a joint project if they are using 
compatible technologies; two people may find it easier to engage in social interaction,
all else being equal, if their beliefs and opinions are similar.


Defining the game. Specifically, here is a first model for such situations, based on 
work of Morris (2000) that in turn builds on earlier work by Blume (1993), Ellison 
(1993), and Young (1998). Consider a graph G = (V, E) in which the nodes are the 
individuals in the population, and there is an edge (v, w) if v and w are friends, or 
otherwise engaged in some kind of social interaction. Sociologists refer to such a graph 
as a social network, a structure in which the nodes are individuals or other social entities 
(such as organizations), and the edges represent some type of social tie. 

We will study a situation in which each node has a choice between two possible 
behaviors: the “old” behavior, labeled A, and the “new” behavior, labeled B. On each 
edge (v, w), there is an incentive for v and w to have their behaviors match, which
we model as the following coordination game parametrized by a real number q, with
0 < q < 1.


• If v and w both choose behavior A, they each receive a payoff of q.
• If v and w both choose behavior B, they each receive a payoff of 1 − q.
• If v and w choose opposite behaviors, they each receive a payoff of 0.


Of course, it is easy to imagine many possible generalizations of this simple game, and 
we will explore some of these in the next section as well as in the exercises at the end 
of the chapter. But for now, we will keep things deliberately simple. 

Node v is playing this game with each of its neighbors in G, and its overall payoff 
is simply the sum of the payoffs from these separate games. Notice how q (specifically
its relation to 1 − q) captures the extent to which the new behavior is preferable to the
old behavior at a purely “local” level, taking into account only pairwise interactions. 

Suppose that the behaviors of all other nodes are fixed, and node v is trying to select 
a behavior for itself. If the degree of node v is $d_v$, and $d_v^A$ of its neighbors have behavior
A and $d_v^B$ have behavior B, then the payoff to v from choosing behavior A is $q d_v^A$ while
the payoff from choosing behavior B is $(1 - q) d_v^B$. A simple computation shows that
v should adopt behavior B if $d_v^B > q d_v$, and behavior A if $d_v^B < q d_v$. (To handle ties,
we will say that v adopts behavior B if $d_v^B = q d_v$.) In other words, q is a threshold: a
node should adopt the new behavior if at least a q fraction of its neighbors have done
so. Note that new behaviors for which q is small spread more easily: a node is more
receptive to switching to a new behavior B when q is small.
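
The "simple computation" can be made concrete. The following sketch (function names are illustrative, not from the text) compares the two payoffs directly and checks that the comparison agrees with the threshold rule $d_v^B \ge q d_v$, including the tie-breaking convention in favor of B.

```python
from fractions import Fraction

def payoffs(q, d_a, d_b):
    """Total payoff to a node from choosing A or B, given d_a neighbors on A and d_b on B."""
    return q * d_a, (1 - q) * d_b

def best_response(q, d_a, d_b):
    """Choose B when its payoff is at least that of A (ties go to B)."""
    payoff_a, payoff_b = payoffs(q, d_a, d_b)
    return "B" if payoff_b >= payoff_a else "A"

def threshold_rule(q, d_a, d_b):
    """Equivalent form: adopt B when at least a q fraction of neighbors follow B."""
    return "B" if d_b >= q * (d_a + d_b) else "A"

# The two formulations agree on every small neighborhood and several values of q.
for q in (Fraction(1, 3), Fraction(1, 2), Fraction(3, 5)):
    for d_a in range(6):
        for d_b in range(6):
            assert best_response(q, d_a, d_b) == threshold_rule(q, d_a, d_b)
```

The equivalence follows because $(1-q)\,d_v^B \ge q\,d_v^A$ rearranges to $d_v^B \ge q\,(d_v^A + d_v^B) = q\,d_v$.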


Cascading behavior and the contagion threshold. We can now study a basic model 
of cascading behavior in G, simply assuming that each node repeatedly updates its 
choice of A or B in response to the current behaviors of its neighbors. Keeping the 
model as simple as possible, we assume that each node simultaneously updates its 
behavior in each of discrete time steps t = 1, 2, 3, .... If S is the set of nodes initially
adopting the new behavior B, we let $h_q(S)$ denote the set of nodes adopting B after
one round of updating with threshold q; we let $h_q^k(S)$ denote the result of applying $h_q$
to S a total of k times in succession; in other words, this is the set of nodes adopting
B after k rounds of updating. Note that nodes may switch from A to B or from B
to A, depending on what their neighbors are doing; it is not necessarily the case, for
example, that S is a subset of $h_q(S)$.

One of the central questions in such a model is to determine when a small set of 
nodes initially adopting a new behavior can eventually convert all (or almost all) of the 
population. We formalize this as follows. First, we will assume that the node set of G 
is countably infinite, with each node having a finite number of neighbors. (Anything 
we refer to as a “graph” in this section will have this property.) We say that a node w 
is converted by a set S if, for some k, the node w belongs to $h_q^j(S)$ for all $j \ge k$. We
say that a set S is contagious (with respect to $h_q$) if every node is converted by S, that
is, if a new behavior originating at S eventually spreads to the full set of nodes. 

Now, it is easier for a set S to be contagious when the threshold q is small, so the 
interesting question is how large a threshold we can have and still observe small sets 
spreading a new behavior to everyone. We therefore define the contagion threshold of 
the social network G to be the maximum q for which there exists a finite contagious 
set. Note that the contagion threshold is a property purely of the topology of G — a 
network with large contagion threshold enables even behaviors that spread sluggishly 
to potentially reach the full population. 
An example and a question. Here is an example to make the definitions
more concrete. Suppose that G is a (two-way) infinite path, with nodes labeled
{..., −2, −1, 0, 1, 2, ...}, and there is a new behavior B with threshold q = 1/2.
Now, first, suppose that the single node 0 initially adopts B. Then in time step t = 1,
nodes 1 and −1 will adopt B, but 0 (observing both 1 and −1 in their initial behaviors
A) will switch to A. In time step t = 2, nodes 1 and −1 will switch back to behavior A,
while nodes 0, 2, and −2 (each seeing a neighbor that followed B in step 1) will adopt
B. The set of adopters thus keeps alternating between the odd-numbered and the
even-numbered nodes of a growing interval: every node switches back to A one step
after adopting B, so no node is ever converted and the new behavior never takes hold.

On the other hand, suppose that the set S = {−1, 0, 1} initially adopts B. Then in
time step t = 1, these three nodes will stay with B, and nodes −2 and 2 will switch to
B. More generally, in time step t = k, nodes {−k, −(k − 1), ..., k − 1, k} will already
be following behavior B, and nodes −(k + 1) and k + 1 will switch to B. Thus, every
node is converted by S = {−1, 0, 1}, the set S is contagious, and hence the contagion
threshold of G is at least q = 1/2. (Note that it would in fact have been sufficient to
start with the smaller set S′ = {0, 1}.)

In fact, 1/2 is the contagion threshold of G: given any finite set S adopting a new 
behavior B with threshold q > 1/2, it is easy to see that B will never spread past the
rightmost member of S. 
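
The path example can be replayed in a few lines. This is a sketch rather than a general simulator: it relies on the observation that, for q > 0, a node with no adopting neighbor chooses A, so one synchronous round only needs to examine the neighbors of current adopters even though the path is infinite. The names `path_neighbors` and `update` are illustrative.

```python
from fractions import Fraction

def path_neighbors(v):
    """Neighbors of node v on the two-way infinite path."""
    return [v - 1, v + 1]

def update(adopters, q, neighbors=path_neighbors):
    """One synchronous round of the nonprogressive rule; assumes q > 0."""
    candidates = {w for v in adopters for w in neighbors(v)}
    result = set()
    for w in candidates:
        nbrs = neighbors(w)
        d_b = sum(1 for u in nbrs if u in adopters)
        if d_b >= q * len(nbrs):  # ties go to B
            result.add(w)
    return result

half = Fraction(1, 2)
s = {-1, 0, 1}
for _ in range(3):
    s = update(s, half)
print(sorted(s))  # the interval from -4 to 4

# With a threshold above 1/2 the same starting set shrinks and disappears.
t = update({-1, 0, 1}, Fraction(3, 5))
print(sorted(t), sorted(update(t, Fraction(3, 5))))  # [0] []
```

Starting from {−1, 0, 1} with q = 1/2 the adopting interval grows by one node on each side per round, while with q = 3/5 the end nodes drop out immediately and the behavior vanishes after two rounds.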

It is instructive to try this oneself on other graphs; if one does, it quickly becomes 
clear that while a number of simple graphs have contagion threshold 1/2, it is hard 
to find one with a contagion threshold strictly above 1/2. This suggests the following 
question: Does there exist a graph G with contagion threshold q > 1/2? We will 
shortly answer this question, after first resolving a useful technical issue in the model. 


Progressive vs. nonprogressive processes. Our model thus far has the property that 
as time progresses, nodes can switch from A to B or from B to A, depending on 
the states of their neighbors. Many behaviors that one may want to model, however, 
are progressive, in the sense that once a node switches from A to B, it remains with 
B in all subsequent time steps. (Consider, for example, a professional community 
in which the behavior is that of returning to graduate school to receive an advanced 
degree. For all intents and purposes, this is a progressive process.) It is worth con- 
sidering a variation on our model that incorporates this notion of monotonicity for 
two reasons. First, it is useful to be able to capture these types of settings; and sec- 
ond, it will turn out to yield useful ways of thinking about the nonprogressive case as 
well. 

We model the progressive contagion process as follows. As before, time moves in 
discrete steps t = 1, 2, 3, .... In step t, each node v currently following behavior A
switches to B if at least a q fraction of its neighbors is currently following B. Any
node following behavior B continues to follow it in all subsequent time steps. Now, if
S is the set of nodes initially adopting B, we let $\bar{h}_q(S)$ denote the set of nodes adopting
B after one round of updating in this progressive process, and we let $\bar{h}_q^k(S)$ denote the
result of applying $\bar{h}_q$ to S a total of k times in succession. We can then define the notion
of converted and contagious with respect to $\bar{h}_q$ exactly as we did for $h_q$.

With a progressive process, it seems intuitively that it should be easier to find finite 
contagious sets — after all, in the progressive process, one does not have to worry about 
early adopters switching back to the old behavior A and thereby killing the spread of 
B. In view of this intuition, it is perhaps a bit surprising that for any graph G, the 
progressive and nonprogressive models have the same contagion threshold (Morris,
2000). 


Theorem 24.1 For any graph G, there exists a finite contagious set with respect
to $h_q$ if and only if there exists one with respect to $\bar{h}_q$.


PROOF Clearly, if S is contagious with respect to $h_q$, then it is also contagious
with respect to $\bar{h}_q$. Hence the crux of the proof is the following: given a set S that
is contagious with respect to $\bar{h}_q$, we need to identify a set S′ that is contagious
with respect to $h_q$.

Thus, let S be contagious with respect to $\bar{h}_q$. The main observation behind the
proof is the following. Since all nodes of G have finite degree, there is a finite
set $\bar{S}$ that consists of S together with every node that has a neighbor in S. Since
$\bar{h}_q^k(S)$ eventually grows to include every node of G, there exists some $\ell$ such
that $\bar{h}_q^\ell(S)$ contains $\bar{S}$. We define $T = \bar{h}_q^\ell(S)$, and we claim that T is contagious
with respect to $h_q$, which will complete the proof. Thus, intuitively, we watch the
progressive process until it “engulfs” the set of initial adopters S, surrounding
them with all their possible neighbors; this larger set is then a robust enough point
that the process would spread even under the nonprogressive rule from here on.

So why is the set T contagious with respect to $h_q$? This requires a bit of
manipulation of the definitions of $h_q$ and $\bar{h}_q$, although the details are not that
complicated. We first note the following fact, whose proof, by induction on j, is
left as an exercise to the reader:


For all X and all j, we have $\bar{h}_q^j(X) = X \cup h_q(\bar{h}_q^{j-1}(X))$. (24.1)


In other words, to get $\bar{h}_q^j(X)$, rather than applying $\bar{h}_q$ to $\bar{h}_q^{j-1}(X)$, we can instead
apply $h_q$ and then add in X.


For ease of notation, let $S_j$ denote $\bar{h}_q^j(S)$, and let $T_j$ denote $h_q^j(T)$. (Recall also
that $T = S_\ell$.) Now, suppose $j > \ell$. Then by (24.1) above, we have

$S_j = S \cup h_q(S_{j-1})$. (24.2)


But since $S_{j-1}$ includes T and hence all the neighbors of S, we have $S \subseteq h_q(S_{j-1})$.
Hence the “S ∪” in (24.2) is superfluous, and we can write $S_j = h_q(S_{j-1})$. By
induction, it now follows that for all $j > \ell$, we have


$h_q^{j-\ell}(T) = h_q^{j-\ell}(S_\ell) = S_j$,


and hence T is contagious with respect to $h_q$.
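
Identity (24.1) is purely set-theoretic, so it can be sanity-checked by brute force on a small finite graph, even though the graphs in this section are infinite. The sketch below uses an arbitrary five-node example graph and illustrative function names; it is a check on one instance, not a substitute for the induction.

```python
from fractions import Fraction
from itertools import combinations

# Illustrative graph: the cycle 0-1-2-3-4-0 plus the chord 0-2 (adjacency lists).
GRAPH = {0: [1, 4, 2], 1: [0, 2], 2: [1, 3, 0], 3: [2, 4], 4: [3, 0]}

def h(adopters, q, graph=GRAPH):
    """Nonprogressive round: exactly the nodes with at least a q fraction of adopting neighbors."""
    return {v for v, nbrs in graph.items()
            if sum(u in adopters for u in nbrs) >= q * len(nbrs)}

def h_bar(adopters, q, graph=GRAPH):
    """Progressive round: current adopters stay, and nodes meeting the threshold join."""
    return set(adopters) | h(adopters, q, graph)

def iterate(step, start, rounds, q):
    current = set(start)
    for _ in range(rounds):
        current = step(current, q)
    return current

def identity_holds(q, max_rounds=5):
    """Check h_bar^j(X) == X | h(h_bar^(j-1)(X)) for every subset X and 1 <= j <= max_rounds."""
    nodes = list(GRAPH)
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            x = set(subset)
            for j in range(1, max_rounds + 1):
                lhs = iterate(h_bar, x, j, q)
                rhs = x | h(iterate(h_bar, x, j - 1, q), q)
                if lhs != rhs:
                    return False
    return True

print(identity_holds(Fraction(1, 2)), identity_holds(Fraction(1, 3)))
```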


The contagion threshold is at most 1/2. We now return to our question: does there 
exist a graph G whose contagion threshold exceeds 1/2? Thanks to Theorem 24.1, 
this question has the same answer regardless of whether we consider the progressive 
or nonprogressive process, and it turns out that the analysis is very easy if we consider 
the progressive version. 

We now show that 1/2 is in fact an upper bound on the contagion threshold for all
graphs (Morris, 2000). This can be read as a general statement about contagion on
networks: a behavior cannot spread very far if it requires a strict majority of your
friends to convince you to adopt it.


Theorem 24.2. The contagion threshold of any graph G is at most 1/2. 


PROOF Let q > 1/2, and let S be any finite subset of the nodes of G. We show
that S is not contagious with respect to $\bar{h}_q$.


Recall our notation that $S_j = \bar{h}_q^j(S)$. For a set of nodes X, we let $\delta(X)$ denote
the set of edges with one end in X and the other end not in X, and we let d(X) be
the cardinality of $\delta(X)$. Since all nodes in G have finite degree, d(X) is a natural
number for any finite set of nodes X.

We now claim that for all j > 0 for which $S_{j-1} \ne S_j$, we have $d(S_j) < d(S_{j-1})$.
To see this, we account for the difference between the sets $\delta(S_{j-1})$ and $\delta(S_j)$ by
allocating it over the separate contributions of the nodes in $S_j - S_{j-1}$. For each
node v in $S_j - S_{j-1}$, its edges into $S_{j-1}$ contribute to $\delta(S_{j-1})$ but not $\delta(S_j)$, and
its edges into $V - S_j$ contribute to $\delta(S_j)$ but not $\delta(S_{j-1})$. But since q > 1/2, and
since v decided to switch to B in iteration j, it has strictly more edges into $S_{j-1}$
(i.e., the nodes that had already adopted B) than it has into $V - S_j$. Summing
these strict inequalities over all nodes v in $S_j - S_{j-1}$, we have $d(S_j) < d(S_{j-1})$.

Finally, we argue that S is not contagious with respect to $\bar{h}_q$. Indeed, the
sequence of numbers $d(S), d(S_1), d(S_2), d(S_3), \ldots$ is strictly decreasing as long
as the sets $S, S_1, S_2, \ldots$ remain distinct from one another. But since d(S) is a
natural number, and $d(S_j) \ge 0$ for all j, there must be some value k for which the
sets stop growing, and $S_k = S_{k+1} = S_{k+2} = \cdots$ from then on. Since $S_j$ is finite
for any j, the set $S_k$ in particular is finite, and hence S is not contagious.
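
The potential-function argument can be watched in action on a small example. The sketch below runs the progressive process with q = 3/5 on an illustrative six-node graph (the graph and the function names are assumptions made for the example) and records the boundary size d(S_j) after each round until the set stops growing.

```python
from fractions import Fraction

# Illustrative undirected graph given by adjacency lists.
GRAPH = {
    "a": ["b", "c", "d", "e"],
    "b": ["a", "c", "d", "e"],
    "c": ["a", "b", "d", "e"],
    "d": ["a", "b", "c", "f"],
    "e": ["a", "b", "c", "f"],
    "f": ["d", "e"],
}

def progressive_step(adopters, q, graph=GRAPH):
    """Adopters stay; any other node joins if at least a q fraction of its neighbors adopt."""
    joining = {v for v, nbrs in graph.items()
               if v not in adopters
               and sum(u in adopters for u in nbrs) >= q * len(nbrs)}
    return set(adopters) | joining

def boundary_size(adopters, graph=GRAPH):
    """d(X): number of edges with exactly one end among the adopters."""
    return sum(1 for v in adopters for u in graph[v] if u not in adopters)

def boundary_trace(seed, q):
    """Boundary sizes d(S), d(S_1), ... up to the round where the set stops growing."""
    current = set(seed)
    sizes = [boundary_size(current)]
    while True:
        nxt = progressive_step(current, q)
        if nxt == current:
            return sizes
        sizes.append(boundary_size(nxt))
        current = nxt

trace = boundary_trace({"a", "b", "c", "d"}, Fraction(3, 5))
print(trace)  # [4, 2, 0]
```

Each round that adds a node strictly lowers the boundary count, which is why a finite starting set can only grow for finitely many rounds when q > 1/2.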


24.3 More General Models of Social Contagion 


Thus far, we have been considering a very simple model for cascading behavior in a 
social network: people switch to a new behavior when a certain threshold fraction of 
neighbors have already switched; but in our first model this threshold was the same for 
all nodes, and all neighbors had equal “weight” in raising a node toward its threshold. 
Clearly a more general model could capture a greater degree of heterogeneity in the 
population. 

It is useful to mention a few preliminary points here. In this section, we will consider 
graphs that may be either finite or infinite. Also, we will work with directed graphs, 
enabling us to encode the notion that for two nodes v and w, the influence of v on w 
may be different from the influence of w on v. (One can model symmetric relationships 
between a pair of nodes in this case by including edges in both directions between them.) 
Also, we will consider contagion processes that are progressive, in that a node never 
switches back from a new behavior to an old behavior; at the end of this section, we
will discuss a way to encode nonprogressive processes by a reduction to the progressive 
case, though by a different means than we saw earlier. 
A linear threshold model. In a first generalization of the model, we allow nodes to
weigh the influences of their neighbors differently.¹ Furthermore, we assume that each
node’s threshold (the fraction of neighbors required for it to adopt the new behavior)
is chosen uniformly at random. Thus, in the Linear Threshold Model, we introduce
the following two ingredients. 


• We have a nonnegative weight $b_{vw}$ on each edge (w, v), indicating the influence that w
exerts on v. We will require that $\sum_{w \in N(v)} b_{vw} \le 1$, where N(v) denotes the set of nodes
with edges to v.

• Each node v chooses a threshold $\theta_v$ uniformly at random from [0, 1]; this indicates the
weighted fraction of v’s neighbors that must adopt the behavior before v does.


Now, the (progressive) dynamics of the behavior operates just as it did in the previous 
section. Some set of nodes S starts out adopting the new behavior B; all other nodes start 
out adopting A. We will say that a node is active if it is following B, and accordingly 
will say that it has been activated when it switches from A to B. 

Time operates in discrete steps t = 1, 2, 3, .... At a given time t, any inactive node
v becomes active if its fraction of active neighbors meets or exceeds its threshold:

$$\sum_{\text{active } w \in N(v)} b_{vw} \ge \theta_v .$$


This in turn may cause other nodes to become active in subsequent time steps, as it 
did in the model of the previous section, leading to potentially cascading adoption of 
behavior B. 
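
A minimal sketch of these dynamics on a finite graph follows. The data layout (`in_weights[v][w]` holding $b_{vw}$), the function name, and the option to pass fixed thresholds instead of drawing them at random are all choices made for this illustration.

```python
import random

def linear_threshold(in_weights, seeds, thresholds=None, seed=0):
    """Run the progressive Linear Threshold dynamics until no new node activates.

    in_weights[v][w] is the influence b_vw of w on v; each node's incoming
    weights are assumed to sum to at most 1.
    """
    if thresholds is None:
        rng = random.Random(seed)
        thresholds = {v: rng.random() for v in in_weights}
    active = set(seeds)
    while True:
        newly_active = {
            v for v, weights in in_weights.items()
            if v not in active
            and sum(b for w, b in weights.items() if w in active) >= thresholds[v]
        }
        if not newly_active:
            return active
        active |= newly_active

# Hypothetical three-node example: a influences b and c, and b influences c.
weights = {"a": {}, "b": {"a": 0.6}, "c": {"a": 0.3, "b": 0.4}}
cascade = linear_threshold(weights, {"a"}, thresholds={"a": 0.9, "b": 0.5, "c": 0.65})
stalled = linear_threshold(weights, {"a"}, thresholds={"a": 0.9, "b": 0.8, "c": 0.65})
print(sorted(cascade), sorted(stalled))  # ['a', 'b', 'c'] ['a']
```

In the first run node c is not activated by a alone but follows once b has been activated, which is the cascading effect described above; raising b's threshold stops the spread at the seed.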

Note how the different thresholds for nodes indicate different predispositions to 
adopt B: a small $\theta_v$ indicates a more liberal approach toward adoption, while a node
with large $\theta_v$ waits until a greater fraction of its neighbors have already adopted. While
we have motivated the model directly in terms of the thresholds, one can also easily 
derive it from a networked coordination game, in which nodes have different payoff 
matrices, and different “stakes” in the games with their various neighbors. 


A general threshold model. Of course, the Linear Threshold Model is still very 
simple, in that it assumes influences of neighbors are strictly additive. It would be nice 
to express the notion that an individual will adopt a behavior when, for example, two 
of her relatives and three of her coworkers do so — a rule that cannot be expressed as a 
simple weighted sum. 

To handle this richer type of model, we consider the following General Threshold 
Model. Each node v now has an arbitrary function $g_v(\cdot)$ defined on subsets of its
neighbor set N(v): for any set of neighbors $X \subseteq N(v)$, there is a value $g_v(X)$ between
0 and 1. We will assume here that this function is monotone in the sense that if $X \subseteq Y$,
then $g_v(X) \le g_v(Y)$.²


1 Given the directed nature of the graph, we adopt the following terminological convention: we say here that w
is a neighbor of v if there is an edge (w, v), and we say that w is an outneighbor if there is an edge (v, w). 

² An interesting issue, which has been the subject of qualitative investigation but much less theoretical modeling, 
is the question of nonmonotone influence: a node may be motivated to adopt a new behavior once a few friends 
have done so, but then motivated to abandon it once too many have done so (see e.g., Granovetter, 1978). 

Now, the dynamics of adoption proceed just as in the Linear Threshold Model, but 
with $g_v(\cdot)$ playing the role of the weighted sum. Specifically, each node v chooses a 
threshold $\theta_v$ uniformly at random in [0, 1]; there is an initial set S of active nodes; 
and for time steps t = 1, 2, 3, ..., each v becomes active if its set X of currently active 
neighbors satisfies $g_v(X) \ge \theta_v$. 

This new model is extremely general — it encodes essentially any threshold rule in 
which influence increases (or remains constant) as more friends adopt. Moreover, the 
assumption that the threshold is selected uniformly at random (rather than from some 
other distribution) is essentially without loss of generality, since other distributions can 
be represented by appropriately modifying the function $g_v$. We now consider one final 
class of models, and then discuss some of the intermediate classes of special cases that 
may hold particular interest. 
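
The dynamics just described can be sketched in a few lines of Python. This is a minimal 
illustration under stated assumptions, not a reference implementation: the function names, 
the dictionary-based graph representation, and the optional random generator argument are 
all hypothetical choices made for the example. The helper at the end shows how a weighted 
sum (the Linear Threshold rule) fits in as one particular choice of threshold function. 

```python
import random


def run_general_threshold(neighbors, g, seeds, rng=None):
    """Simulate one run of the progressive General Threshold Model.

    neighbors[v] -- set of neighbors of v (nodes w with an edge (w, v))
    g[v]         -- monotone function from a frozenset of v's neighbors to [0, 1]
    seeds        -- the initial active set S
    (All three argument shapes are assumptions made for this sketch.)
    """
    rng = rng or random.Random()
    # Each node draws its threshold uniformly at random.
    theta = {v: rng.random() for v in neighbors}
    active = set(seeds)
    while True:
        # One discrete time step: test every inactive node against its threshold.
        newly_active = set()
        for v in neighbors:
            if v in active:
                continue
            x = frozenset(neighbors[v] & active)
            if g[v](x) >= theta[v]:
                newly_active.add(v)
        if not newly_active:
            return active  # no change in a full step, so the process has stopped
        active |= newly_active


def linear_threshold_g(weights):
    """Threshold function for the additive special case: g(X) = sum of weights in X."""
    return lambda x: sum(weights[w] for w in x)
```

Because the process is progressive and g is monotone, the loop ends after at most n steps, 
where n is the number of nodes. 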


A cascade model. Thus far, we have formulated models for the spread of a behavior 
strictly in terms of node thresholds — as some people adopt the behavior, the thresholds 
of others are exceeded, they too adopt, and the process spreads. It is natural, however, 
to ask whether we can pose a different model based more directly on the notion that 
new behaviors are contagious: a probabilistic model in which you “catch” the behavior 
from your friends. It turns out not to be hard to do this, and moreover, the resulting 
model is equivalent to the General Threshold Model. 

We define the Cascade Model to incorporate these ideas as follows. Again, there is 
an initial active set S, but now the dynamics proceeds as follows: whenever there is an 
edge (u, v) such that u is active and v is not, the node u is given one chance to activate 
v. This activation succeeds with some probability that depends not just on u and v, but 
also on the set of nodes that have already tried and failed to activate v. If u succeeds, 
then v may now in turn try to activate some of its (currently inactive) outneighbors; if 
u fails, then u joins the set of nodes who have tried and failed to activate v. 

This model thus captures the notion of contagion more directly, and also allows us 
to incorporate the idea that a node’s receptiveness to influence depends on the past 
history of interactions with its neighbors. We make the model concrete as follows. In 
place of a function $g_v$, each node v now has an incremental function that takes the form 
$p_v(u, X)$, where u is a neighbor of v and X is a set of neighbors of v not containing u. 
The value $p_v(u, X)$ is the probability that u succeeds in activating v, given that the 
set X of neighbors has already tried and failed. For our purposes here, we will only 
consider functions $p_v$ that are order-independent: if a set of neighbors $u_1, u_2, \ldots, u_k$ all 
try to influence v, then the overall probability of success (as determined by successive 
applications of $p_v$) does not depend on the order in which they try. 

While the Cascade Model is syntactically different from the General Threshold 
Model, we now argue that the two are in fact equivalent: One can translate from a set of 
incremental functions $p_v$ to a set of threshold functions $g_v$, and vice versa, so that the 
resulting processes produce the same distributions on outcomes (Kempe et al., 2005). 

We now describe the translations in both directions; further detail behind the proofs 
can be found in Kempe et al. (2005). First, suppose we are given an instance of the 
General Threshold Model with functions $g_v$; we define corresponding functions $p_v$ as 
follows. If a set of nodes X has already tried and failed to activate v, then we know that 
v’s threshold $\theta_v$ lies in the interval $(g_v(X), 1]$; subject to this constraint, it is uniformly 
distributed. In order for u to succeed after all the nodes in X have tried and failed, we 
must further have $\theta_v \le g_v(X \cup \{u\})$. Hence we should define the incremental function 

\[ p_v(u, X) = \frac{g_v(X \cup \{u\}) - g_v(X)}{1 - g_v(X)}. \]

Conversely, suppose that we have incremental functions $p_v$. Then the probability v is 
not activated by a set of neighbors $X = \{u_1, u_2, \ldots, u_k\}$ is $\prod_{i=1}^{k} (1 - p_v(u_i, X_{i-1}))$, 
where we write $X_{i-1} = \{u_1, \ldots, u_{i-1}\}$. Note that order-independence is crucial here, 
to ensure that this quantity is independent of the way in which we label the elements 
of X. Hence we can define a threshold function $g_v$ by setting 

\[ g_v(X) = 1 - \prod_{i=1}^{k} \bigl(1 - p_v(u_i, X_{i-1})\bigr). \]

This completes the translations in both directions, and hence establishes the equivalence 
of the two models. 
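
The two translations can be written out for a single node v as a small Python sketch. 
The function names p_from_g and g_from_p are hypothetical, and the sketch assumes that the 
threshold function satisfies g(∅) = 0, that neighbor labels can be sorted, and that the 
incremental function is order-independent (so that any fixed order may be used in the 
product). 

```python
def p_from_g(g):
    """Incremental function p(u, X) induced by a threshold function g for one node."""
    def p(u, failed):
        failed = frozenset(failed)
        remaining = 1.0 - g(failed)  # length of the interval (g(X), 1]
        if remaining <= 0.0:
            # g(X) = 1 would already have activated the node, so this
            # case cannot arise in a run; return 0 to keep p well defined.
            return 0.0
        return (g(failed | {u}) - g(failed)) / remaining
    return p


def g_from_p(p):
    """Threshold function g(X) induced by an order-independent incremental function p."""
    def g(x):
        prob_all_fail = 1.0
        tried = frozenset()
        for u in sorted(x):  # any fixed order works when p is order-independent
            prob_all_fail *= 1.0 - p(u, tried)
            tried = tried | {u}
        return 1.0 - prob_all_fail
    return g
```

Composing the two maps returns the original threshold function: the factors in the product 
telescope to (1 - g(X)) / (1 - g(∅)), which equals 1 - g(X) under the assumption g(∅) = 0. 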

Next we consider some special cases of the Cascade Model that will be of particular 
interest to us. (Given the equivalence to the General Threshold Model, these could also 
be written in that framework, though not always as simply.) 


(i) First, it is easy to encode the notion that v will deterministically activate once it has 
k active neighbors: we simply define $p_v(u, X) = 0$ if $|X| \ne k - 1$, and $p_v(u, X) = 1$ 
if $|X| = k - 1$. 

(ii) In contrast, the influence of a node’s neighbors exhibits diminishing returns if it 
attenuates as more and more people try and fail to influence it. Thus, we say that a 
set of incremental functions $p_v$ exhibits diminishing returns if $p_v(u, X) \ge p_v(u, Y)$ 
whenever $X \subseteq Y$. 

(iii) A particularly simple special case that exhibits diminishing returns is the Independent 
Cascade Model, in which u’s influence on v is independent of the set of nodes that 
have already tried and failed: $p_v(u, X) = p_{uv}$ for some parameter $p_{uv}$ that depends 
only on u and v. 


We will see that the contrast between (i) and (ii) above will emerge as a particularly 
important qualitative distinction: whether the influence of one’s neighbors in the social 
network incorporates some notion of “critical mass” (as in (i)), with a crucial number 
of adopters needed for successful influence; or whether the strength of influence simply 
decreases steadily (as in (ii)) as one is exposed more and more to the new behavior. 
In the next section, we will discuss an algorithmic problem whose computational 
complexity is strongly affected by this distinction; and following that, we will discuss 
some recent empirical studies that seek to identify the two sides of this dichotomy in 
online influence data. 

Before this, we briefly discuss a useful way of translating between the progressive 
and nonprogressive versions of these cascade processes. 


Progressive vs. nonprogressive processes (redux). The discussion in this section has 
been entirely in terms of progressive processes, where nodes switching from the old 
behavior A to the new behavior B never switch back. There is a useful construction 
that allows one to study the nonprogressive version of the process by translation to 
a progressive one on a different graph (Kempe et al., 2003). As it is a very general 
construction, essentially independent of the particular influence rules being used, we 
describe it at this level of generality. 

Given a graph G on which we have a nonprogressive process that may run for up to 
T steps, we create a larger graph G’ built from T copies of G, labeled $G_1, G_2, \ldots, G_T$. 
Now, let $v^{(t)}$ be the copy of node v in the graph $G_t$; we construct edges $(u^{(t-1)}, v^{(t)})$ 
for each neighbor u of v. As a result, the neighbors of $v^{(t)}$ in G’ are just the copies of 
v’s neighbors that “live” in the previous time-step. In this way, we can define the same 
influence rules on G’, node-by-node, that we had in G, and study the nonprogressive 
process in G as a progressive process in G’: some copies of v in G’ will be active, 
and others will not, reflecting precisely the time steps in which v was active in G. 
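
The layered construction is easy to express in code. The following Python sketch builds the 
neighbor lists of G’ from those of G; the function name and the representation of a copy of 
v in the t-th layer as the pair (v, t) are assumptions made for illustration. How the initial 
state of the nonprogressive process is encoded in the first layer is left open here. 

```python
def time_expanded_graph(in_neighbors, T):
    """Build the neighbor lists of G' from T copies of G.

    in_neighbors[v] -- set of neighbors u of v in G (edges (u, v))
    Returns a dict mapping each copy (v, t), for t = 1..T, to its neighbors in G'.
    """
    expanded = {}
    for t in range(1, T + 1):
        for v, nbrs in in_neighbors.items():
            if t == 1:
                # Copies in the first layer have no earlier layer to draw from.
                expanded[(v, t)] = set()
            else:
                # Edge (u^(t-1), v^(t)) for each neighbor u of v.
                expanded[(v, t)] = {(u, t - 1) for u in nbrs}
    return expanded
```

Every edge of G’ points from one layer to the next, so G’ is acyclic, and a copy’s state 
depends only on the previous layer. 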


24.4 Finding Influential Sets of Nodes 


A number of current Internet applications concern domains in which cascading behavior 
is a crucial phenomenon, and where a better understanding of cascades could lead to 
important insights. One example is viral marketing, where a company tries to use word- 
of-mouth effects to market a product with a limited advertising budget, relying on the 
fact that early adopters may convince friends and colleagues to use the product, creating 
a large wave of adoptions. While word-of-mouth effects have a history in the area of 
marketing that long predates the Internet, viral marketing has become a particularly 
powerful force in online domains, given the ease with which information spreads, and 
the rich data on customer behavior that can be used to facilitate the process. 

A second example is the design of search tools to track news, blogs, and other 
forms of online discussion about current events. When news of an event first appears, 
it generally spreads rapidly through a network of both mainstream news sources and 
the larger population of bloggers — as news organizations and individuals learn of 
the event, they write their own commentary or versions of the story, and subsequent 
waves can occur as new developments take place. As with our first example, news was 
studied as a diffusion process long before the appearance of the Internet, but the fact 
that news sources are now online provides large-scale, time-resolved data for studying 
this diffusion, as well as the opportunity to build tools that can help people track the 
development of a news story in real time as it evolves. 

There are a number of interesting algorithmic questions related to these processes, 
and here we focus on a particular one, posed by Domingos and Richardson (2001) — the 
identification of influential sets of nodes. While this is a natural question in the context 
of both our examples above, we describe the problem in terms of the viral marketing 
framework, where it is particularly easy to express the underlying motivation. 


The most influential set of nodes. Suppose that we are a firm trying to market a 
new product, and we want to take advantage of word-of-mouth effects. One strategy 
would be as follows: we collect data on the social network interactions among our 
potential customers, we choose a set S of initial adopters, and we market the product 
directly to them. Then, assuming they adopt the product, we rely on their influence to 
generate a large cascade of adoptions, without our having to rely on any further direct 
promotion of the product. The question of how one goes about inferring the social 
network interactions and building a cascade model for this purpose is an interesting 
and largely unexplored topic. Here, however, we assume that all this data is provided 
to us, and we focus on the algorithmic problem that comes next: how do we choose the 
set S? 

We can formulate this question concretely as follows. For any instance of the General 
Threshold or Cascade Models, there is a natural influence function $f(\cdot)$ defined as 
follows: for a set S of nodes, f(S) is the expected number of active nodes at the end 
of the process, assuming that S is the set of nodes that are initially active. (We will 
assume in this section that all graphs are finite, so the processes we are considering 
terminate in a number of steps that is bounded by the total number of nodes n.) From 
the marketer’s point of view, f(S) is the expected number of total sales if they get S 
to be the set of initial adopters. Now, given a budget k, how large can we make f(S) 
if we are allowed to choose a set S of k initial adopters? In other words, we wish to 
maximize f(S) over all sets S of size k. 
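
To make the influence function concrete, here is a minimal Python sketch that estimates f(S) 
by repeated simulation, using the Independent Cascade Model of the previous section as the 
underlying process. The function names, the edge representation out_edges[u] = {v: p_uv}, 
and the default number of runs are hypothetical choices for this example; the estimate is a 
sample average, not the exact expectation. 

```python
import random


def simulate_independent_cascade(out_edges, seeds, rng):
    """Run one Independent Cascade process and return the final active set.

    out_edges[u] -- dict mapping each outneighbor v of u to the probability p_uv
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            # A newly active u gets exactly one chance on each edge (u, v).
            for v, p_uv in out_edges.get(u, {}).items():
                if v not in active and rng.random() < p_uv:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_influence(out_edges, seeds, runs=1000, seed=0):
    """Sample-average estimate of f(S), the expected final number of active nodes."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        total += len(simulate_independent_cascade(out_edges, seeds, rng))
    return total / runs
```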

This turns out to be a hard computational problem. First, for almost any special case 
of the models in the previous section — even very simple special cases — it is NP-hard 
to find the optimal set S. Moreover, one can construct instances of the model for which 
it is NP-hard even to approximate the optimal value of f(S) to within a factor of $n^{1-\varepsilon}$ 
for any $\varepsilon > 0$, where again n is the number of nodes (Kempe et al., 2003). We leave 
the proofs of these statements as exercises to the reader. 

Since NP-hardness applies to almost all versions of the model, there is not much we 
can do about it; instead, we will try to identify broad subclasses of the models that are 
not susceptible to strong inapproximability results, and for which good approximation 
results can be obtained. 


Submodularity as a route to good approximations. While we will not go into the 
details of the inapproximability proofs, they rely on constructing a cascade process 
for which the resulting influence function f has a “knife-edge” property: as one adds 
nodes to S, one initially gets very little spreading, but once exactly the right set has been 
added, the process suddenly spreads very widely. And as we will see shortly, this is 
actually crucial, since influence functions f that grow in a less pathological way allow 
for good approximation algorithms. The key to this is a property called submodularity. 
We say that a function f is submodular if adding an element to a set Y causes 
a smaller marginal improvement than adding the same element to a subset of Y. 
Specifically, f is submodular if for all sets $X \subseteq Y$ and all elements $v \notin Y$, we have 

\[ f(X \cup \{v\}) - f(X) \ge f(Y \cup \{v\}) - f(Y). \]

Thus, submodularity is a type of diminishing returns property: the benefit of adding 
elements decreases as the set to which they are being added grows. We also note that 
all the influence functions arising from our models here are monotone, in the sense that 
$f(\emptyset) = 0$ and $f(X) \le f(Y)$ whenever $X \subseteq Y$. 

For such a function f, it is natural to hope that a simple “hill-climbing” approach 
might lead to a good approximation for the optimal value over all k-element sets: 
since the marginal benefits only decrease as elements are added, it is hard to “hide” a 
very large optimum. In fact, this intuition is formalized in the following theorem of 
Nemhauser, Wolsey, and Fisher (1978). 


Theorem 24.3 Let f be a monotone submodular function, and let S* be the k-element 
set achieving the maximum possible value of f. Let S be a k-element set obtained 
by repeatedly, for k iterations, including the element producing the largest 
marginal increase in f. (We can think of S as the result of straightforward 
hill-climbing with respect to the function f.) Then 
$f(S) \ge (1 - \frac{1}{e}) f(S^*) \ge 0.63\, f(S^*)$. 


This theorem will be our main vehicle for obtaining approximation algorithms for 
influence maximization: we will identify instances of the Cascade Model for which 
the influence function is submodular (it is always monotone for the models we consider), 
and this will allow us to obtain good results from hill-climbing. There is one 
further wrinkle in our use of Theorem 24.3, however: it is computationally intractable, 
even in simple special cases, to evaluate the function f exactly. Fortunately, one can 
adapt Theorem 24.3 to show that for any $\varepsilon > 0$, if one can approximately evaluate 
f sufficiently accurately (relative to the granularity taken by the values of f, which 
are integers in our case), this approximate evaluation of f produces a set S such that 
$f(S) \ge (1 - \frac{1}{e} - \varepsilon) f(S^*)$. As we can achieve such approximate evaluation for the 
influence functions we are considering here, this will allow us to apply this approximate 
form of Theorem 24.3. 
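
The hill-climbing procedure of Theorem 24.3 can be sketched as follows. This is an 
illustrative Python version under stated assumptions: the function name is hypothetical, and 
f is passed in as any callable on sets, which in the influence setting would be an exact or 
approximate evaluator of the influence function. 

```python
def greedy_hill_climbing(candidates, f, k):
    """Pick k elements, each time adding the one with the largest marginal gain in f.

    candidates -- iterable of elements that may be chosen
    f          -- set function, called as f(set_of_elements)
    """
    candidates = list(candidates)
    chosen = set()
    for _ in range(k):
        base = f(chosen)
        best, best_gain = None, float("-inf")
        for v in candidates:
            if v in chosen:
                continue
            gain = f(chosen | {v}) - base  # marginal increase from adding v
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:
            break  # fewer than k candidates were available
        chosen.add(best)
    return chosen
```

Each iteration makes one call to f per remaining candidate, so the procedure uses on the 
order of nk evaluations of f for n candidates. 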

We now proceed with the search for broad special cases of the Cascade Model for 
which the influence function f is submodular. 


Diminishing returns and submodularity. It is perhaps easier to think first of instances 
for which the resulting function f is not submodular. For example, suppose that every 
node v has a sharp threshold $\ell > 1$: v requires at least $\ell$ active neighbors before v itself 
becomes active. Then it is easy to construct graphs G on which f(S) remains small 
until a sufficient number of nodes have been included in S, and then it abruptly jumps 
up; such a function is not submodular. 
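
A tiny instance makes the failure of submodularity concrete. The Python sketch below uses a 
hypothetical five-node graph and a hypothetical helper name; every node requires two active 
neighbors. Since the process is deterministic, f(S) is simply the size of the final active set. 

```python
def spread_with_sharp_threshold(in_neighbors, seeds, ell):
    """Final active set when each node needs at least `ell` active neighbors."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, nbrs in in_neighbors.items():
            if v not in active and len(nbrs & active) >= ell:
                active.add(v)
                changed = True
    return active


# Hypothetical example graph: c listens to a and b, d to b and c, e to c and d.
GRAPH = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"b", "c"}, "e": {"c", "d"}}


def f(seeds):
    return len(spread_with_sharp_threshold(GRAPH, seeds, ell=2))


gain_on_empty = f({"b"}) - f(set())    # adding b to the empty set
gain_on_a = f({"a", "b"}) - f({"a"})   # adding b to the larger set {a}
```

Here adding b to the empty set gains 1 node, while adding b to the larger set {a} gains 4, 
so the marginal benefit grows rather than shrinks, which violates the submodular inequality. 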

More generally, instances based on critical mass effects at the level of individual 
nodes tend not to yield influence functions that are submodular. But is the opposite 
true? Does diminishing returns at the level of individual nodes imply that the influence 
function f is submodular? In fact, the following theorem of Kempe, Kleinberg, and 
Tardos (2003, 2005) establishes a general sense in which this is the case. 


Theorem 24.4 For any instance of the Cascade Model in which all the incremental 
functions $p_v$ exhibit diminishing returns, the resulting influence function 
f is submodular. 


An appealing feature of this theorem is the way in which it establishes a “local-to-global” 
link: if each individual node experiences influence in a way that exhibits 
diminishing returns, then the network as a whole experiences influence (from an 
external marketer) in the same way. The search for such cases in which local behavior 
implies analogous global behavior is a theme throughout social network research, and 
finding common principles underlying such effects is an interesting general research 
issue. 

The proof of Theorem 24.4, while not long, is a bit intricate; so instead we describe 
the proof of a special case of this theorem, where the analysis is conceptually very 
clean, yet still illustrative of some of the issues involved in proving submodularity. 
Recall that in the Independent Cascade Model, defined in the previous section, we 
define the incremental functions via $p_v(u, X) = p_{uv}$: in other words, the influence of 
u on v depends only on u and v, not on the set of other nodes that have already tried 
and failed to influence v. 


Theorem 24.5 For any instance of the Independent Cascade Model, the resulting 
influence function f is submodular. 


PROOF Much of the challenge in proving submodularity here is that the computational 
intractability of our function f also translates into a kind of conceptual 
intractability: we do not know enough about its structure to simply “plug it in” to 
the submodular inequality and hope to verify it directly. A much more powerful 
approach is instead to decompose f into simpler functions, and check submodularity 
for the parts of this decomposition. 

To lay the groundwork for this plan, we start by discussing two useful facts 
about submodularity. The first is this: if $f_1, f_2, \ldots, f_r$ are submodular functions, 
and $c_1, \ldots, c_r$ are nonnegative real numbers, then the function f defined by 
the weighted sum $f(X) = \sum_{i=1}^{r} c_i f_i(X)$ is also submodular. The second fact 
is actually the identification of a simple, useful class of submodular functions, 
the following. Suppose that we have a collection of sets $C_1, C_2, \ldots, C_r$, and 
for a set $X \subseteq \{1, 2, \ldots, r\}$, we define $f(X) = |\bigcup_{i \in X} C_i|$. For obvious reasons, 
we will call such a function f a size-of-union function. Then it is not hard 
to verify that any size-of-union function is submodular. As we will see below, 
such functions will arise naturally in our decomposition of the influence 
function f. 

Now, consider the following alternate way of viewing the Independent 
Cascade Process. In the standard view, each time a node u becomes active, 
it flips a coin of bias $p_{uv}$ to determine whether it succeeds in activating 
v. Now, in the alternate, equivalent view of the process, suppose that for 
each edge (u, v), we flip a coin of bias $p_{uv}$ in advance, planning to only 
consult the outcome of the coin flip if we ever need to, when u becomes 
active. 

If there are m edges in the graph, then there are $2^m$ possible collective outcomes 
of the coin flips. Let $\alpha$ denote a particular one of these $2^m$ outcomes, and let $f_\alpha(S)$ 
denote the eventual number of activated nodes, given that S is the initial active 
set and $\alpha$ is the outcome of the coin flips. Unlike f, the function $f_\alpha$ is easy to 
understand, as follows. For each edge (u, v), we say that it is live (with respect 
to $\alpha$) if the advance coin flip came up heads. For a node s, we let $R_\alpha(s)$ denote 
the set of all nodes that are reachable from s on paths consisting entirely of live 
edges. It is now easy to check that a node is eventually activated if and only if it 
is reachable from some node in S by some path consisting entirely of live edges, 
and hence 

\[ f_\alpha(S) = \Bigl| \bigcup_{s \in S} R_\alpha(s) \Bigr|. \]

In other words, each $f_\alpha$ is a size-of-union function, and hence submodular. 
But this is essentially all we need, since by definition 

\[ f(S) = \sum_{\alpha} \mathrm{Prob}[\alpha] \cdot f_\alpha(S). \]

That is, f is a nonnegative weighted sum of submodular functions, and hence f 
is submodular, as desired. 
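
On a very small graph, the decomposition used in the proof can be carried out literally. The 
Python sketch below enumerates all outcomes of the advance coin flips, sums the probability 
of each outcome times the number of nodes reachable over live edges, and then checks the 
submodular inequality by brute force. All names and the edge representation {(u, v): p_uv} 
are hypothetical, and the enumeration is exponential in the number of edges, so this is 
only a check for toy instances. 

```python
from itertools import combinations, product


def reachable(live_out, sources):
    """Nodes reachable from `sources` along live edges (depth-first search)."""
    seen = set(sources)
    stack = list(seen)
    while stack:
        u = stack.pop()
        for v in live_out.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def exact_influence(edges, seeds):
    """f(S) as the sum over coin-flip outcomes of Prob[outcome] * f_outcome(S)."""
    edge_list = list(edges)
    total = 0.0
    for outcome in product([False, True], repeat=len(edge_list)):
        prob = 1.0
        live_out = {}
        for (u, v), is_live in zip(edge_list, outcome):
            p_uv = edges[(u, v)]
            prob *= p_uv if is_live else 1.0 - p_uv
            if is_live:
                live_out.setdefault(u, []).append(v)
        total += prob * len(reachable(live_out, seeds))
    return total


def is_submodular(f, ground, tol=1e-12):
    """Brute-force check of f(X + v) - f(X) >= f(Y + v) - f(Y) for all X within Y."""
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    for x in subsets:
        for y in subsets:
            if not x <= y:
                continue
            for v in ground:
                if v in y:
                    continue
                if f(x | {v}) - f(x) < f(y | {v}) - f(y) - tol:
                    return False
    return True
```

For example, with three edges (a, b), (b, c), and (a, c), each of probability 0.5, the 
function gives f({a}) = 1 + 0.5 + 0.625 = 2.125, and the brute-force check confirms the 
inequality on that instance, as Theorem 24.5 predicts. 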


We close this particular discussion by noting that Theorem 24.4 is not the most 
general possible formulation of the local-to-global principle we have been discussing. 
In particular, if all the incremental functions $p_v$ in an instance of the Cascade Model 
exhibit diminishing returns, then in the equivalent instance of the General Threshold 
Model, the resulting threshold functions $g_v$ are submodular (a related notion of 
diminishing returns). However, the converse does not hold: there exist instances of the 
General Threshold Model with submodular threshold functions $g_v$, for which the 
equivalent instance of the Cascade Model has incremental functions $p_v$ that do not satisfy 
diminishing returns. (An important example is the Linear Threshold Model, which does 
not translate into an instance of the Cascade Model with diminishing returns. Despite 
this, one can show via a separate analysis that influence functions f arising from the 
Linear Threshold Model are always submodular.) 

Thus, the General Threshold Model with submodular threshold functions is strictly 
more general than the Cascade Model with incremental functions satisfying diminishing 
returns. Hence the following very recent result of Mossel and Roch (2007), proving a 
conjecture of Kempe et al. (2003), generalizes Theorem 24.4. 


Theorem 24.6 For any instance of the General Threshold Model in which all 
the threshold functions $g_v$ are submodular, the resulting influence function f is 
submodular. 


Further direction: Alternate marketing strategies. We conclude this section by 
briefly discussing a different strategy through which a marketer could try to take 
advantage of word-of-mouth effects. Rather than targeting nodes, as we have been 
discussing thus far, one could instead target edges: each time an individual u buys a 
product, an incentive is offered for u to recommend the product to a friend v. A number 
of online retailers have constructed recommendation incentive programs around this 
idea: for example, each time you buy a product, you are given the opportunity to send 
an e-mail to a friend with a special offer to buy the product as well; if the friend goes 
on to buy it, each of you receives a small cash refund (Leskovec et al., 2006a, 2006b). 

Strategies of this type have a different flavor from the targeting of nodes: rather 
than trying to create a large cascade by influencing initial adopters, one tries to create 
a large cascade in effect by amplifying the force with which influence is transmitted 
across edges. (Clearly, one still needs some initial adopters as well for this to succeed.) 
While there has been empirical work on the outcomes of particular recommendation 
incentive programs, it is an open question to analyze theoretical models for the result 
of such incentives for the problem of maximizing influence. 


24.5 Empirical Studies of Cascades in Online Data 


As we noted at the outset, there has been a huge amount of empirical work on diffusion 
in social networks over the past half century. A crucial challenge in this research is the 
fact that the phenomena being studied — the ways in which social network links affect 
adoption of innovations — are very hard to measure and assess. Most research of this 
type has focused on small or moderately sized groups, which are then studied in detail, 
and the resulting analysis has provided fundamental insights into subtle issues such as 
the characteristics of adopters at different stages of the innovation. 

The theoretical models we have been discussing thus far are motivated at a qualitative 
level by this empirical work, but it remains an important problem to relate the models 
to real diffusion data at a more precise, quantitative level. One reason why it has been 
difficult to do this stems from the type of data available: While existing empirical 
studies can address fairly rich questions at the scale at which they operate, the resulting 
datasets tend to be too small to provide accurate estimates of basic quantities needed for 
assessing the validity of the theoretical models — for example, how adoption probability 
depends on structural properties of a node’s network neighbors. What is needed for 
this task are large datasets tailored to provide answers to such questions with limited 
noise. 


Diffusion data from online communities. Very recently, large online datasets from 
several sources have produced measurements that raise interesting connections to the 
theoretical models. One such study was performed on the online blogging and social 
networking site LiveJournal (Backstrom et al., 2006). LiveJournal has several million 
members and several hundred thousand user-defined communities; members maintain 
individual Web pages with personal information, blogs, and — most importantly for our 
purposes here — lists of their friends in the system and the communities to which they 
belong. 

From the lists of friends, we can construct an underlying social network, with an 
edge (v, w) if v lists w as a friend. We then treat each community as a behavior that 
diffuses through this network: since communities grow by acquiring members over 
time, we can study how a member’s likelihood of joining a group depends on the 
number of friends he or she has in the group. 

Here is a concrete way of formulating this question. At two times $t_1$ and $t_2$ (a few 
months apart), snapshots are taken of the social network and community memberships 
on LiveJournal. Now, for each number $k \ge 0$, consider the set $U_k$ of all pairs (u, C) 
such that user u did not belong to community C at time $t_1$, but u had k friends in 
C at $t_1$. We let P(k) denote the fraction of pairs (u, C) in the set $U_k$ for which u had 
joined C by time $t_2$. In this way, P(k) serves as an answer to the question: what is the 
probability of joining a LiveJournal community, given that you had k friends in it at an 
earlier time? 
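
The definition of P(k) translates directly into a counting procedure over the two snapshots. 
The Python sketch below is only an illustration of that definition: the function name and 
the input layout (a friend set per user at the first snapshot, and a member set per 
community at each snapshot) are assumptions, not the format of the actual LiveJournal data. 

```python
from collections import defaultdict


def join_probability_by_friend_count(friends, members_t1, members_t2):
    """Compute P(k) for every k that occurs.

    friends[u]    -- set of friends of user u at the first snapshot
    members_t1[C] -- set of members of community C at the first snapshot
    members_t2[C] -- set of members of community C at the second snapshot
    """
    eligible = defaultdict(int)  # size of U_k: non-member pairs (u, C) with k friends in C
    joined = defaultdict(int)    # pairs in U_k where u had joined C by the second snapshot
    for community, at_t1 in members_t1.items():
        at_t2 = members_t2.get(community, set())
        for u, friend_set in friends.items():
            if u in at_t1:
                continue  # u already belonged to C, so (u, C) is in no U_k
            k = len(friend_set & at_t1)
            eligible[k] += 1
            if u in at_t2:
                joined[k] += 1
    return {k: joined[k] / eligible[k] for k in eligible}
```

This direct version examines every (user, community) pair, which is fine for a toy example; 
at the scale described above one would presumably iterate only over users who have at 
least one friend in a community and handle k = 0 separately. 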


[Plot omitted. Title: Probability of joining a community when k friends are already members. 
Vertical axis: probability.] 

Figure 24.1. The probability of joining a LiveJournal community as a function of the number 
k of friends in the community at an earlier point in time. Error bars represent two standard 
errors. 


Figure 24.1 shows a plot of P(k) as a function of k for the LiveJournal data. A 
few things are quickly apparent from the plot. First, the dependence on k is clearly 
dominated by a diminishing returns effect, in which P(k) grows less quickly as k 
increases. Indeed, this dependence is quite smooth, with a good fit to a function of the 
form $P(k) = \varepsilon \log k$ up to moderately large values of k; in particular, this means that 
P(k) continues increasing even as k becomes fairly large. Finally, there is an initial but 
significant deviation from diminishing returns for k = 0, 1, 2, with P(2) > 2P(1). In 
other words, having a second friend in a community gives a significant boost to the 
probability of joining, but after that the diminishing returns effect takes over. 

Similar diminishing returns effects have been observed recently in other large-scale 
datasets that exhibit diffusion-style processes: for example, the probability of publishing 
a paper at a computer science conference as a function of the number of coauthors 
who have previously published there (Backstrom et al., 2006); or the probability of 
purchasing a DVD in a recommendation incentive program run by a large online retailer, 
as a function of the number of friends who sent an e-mail recommendation (Leskovec 
et al., 2006). Given that a common effect, diminishing returns, is appearing in 
large-scale data from such diverse sources, it is an interesting open question to find 
a reasonable mechanism to explain this, potentially including the approximately 
logarithmic functional form of P(k) in terms of k. 

Closer analysis of the LiveJournal data also reveals more subtle effects that contribute 
to diffusion in a significant way. One that is particularly striking is the connectedness 
of a person’s friends. If we look at the pairs in $U_k$ (recall that this consists of users 
with k friends in a community they do not initially belong to), it turns out that a user is 
significantly more likely to join a community if his or her k friends have many edges 
among themselves than if they do not. In other words, a highly connected set of friends 
in a community exerts a greater “gravitational force” than a comparable number of 
friends who are not as well connected among themselves. It is an interesting open 
question to understand the underlying factors that lead to this effect, to investigate how 
broadly it applies by considering related datasets, and to think about ways of extending 
the theoretical models to incorporate effects such as the connectedness of a node’s 
neighbor set. 


Relating the empirical and theoretical models. The results shown in Figure 24.1 
provide a good means of elaborating on the point made at the start of this section, that 
diffusion data from online sources, at very large scales, provides both more and less 
than one finds in classical diffusion studies. It provides less in the sense that we know 
very little about who individual users on LiveJournal are, what the links between them 
mean, or what motivates them to join communities. A LiveJournal user may have a 
link to a close friend, or to someone they have barely met, or simply because they are 
trying to accumulate as many links on the system as they can. Given this, it is very hard 
to ask the kind of nuanced questions that one sees in more traditional diffusion studies, 
which deal with smaller datasets for which they have assembled (often at great effort) 
a much clearer picture. On the other hand, the fact that the curve in Figure 24.1 is so 
smooth is precisely the result of having a dataset large enough to contain hundreds 
of thousands of communities diffusing across a network of millions of users — on a 
dataset containing just hundreds of individuals, any curve representing P(k) will be 
extremely noisy. Indeed, given how many different things the links and community 
memberships on LiveJournal mean to different users, the clean logarithmic form of the 
resulting curve is perhaps all the more striking, and in need of deeper explanation. 

That is a first caveat. A second is that there remains a significant challenge in relating 
curves like the one in Figure 24.1 to the theoretical models in the earlier sections. The 
models we discussed there all had a discrete, operational flavor: each node follows 
a fixed probabilistic rule, and it uses this rule to incorporate information from its 
neighbors over time. In contrast, the curve in Figure 24.1 is produced by observing the 
full system at one point in time, and then returning to see what has changed at a later 
point in time. The dependence of P(k) on k expressed in this way reflects an aggregate 
property of the full population, and does not imply anything about any particular 
individual’s response to their friends’ behaviors. Even for a specific individual, we 
do not know when (or even if) they became aware of their friends’ behavior between 
these two points of time, nor when this translated into a decision by them to act. 
In particular, this makes it hard to determine how the notion of diminishing returns 
captured in Figure 24.1 is actually aligned with the formal definition of diminishing 
returns in Sections 24.3 and 24.4. It is a general and interesting question to explore 
frameworks that can incorporate this asynchrony and uncertainty about the way in 
which information flows and is acted on by nodes in a social network, leading to a 
closer integration of the theoretical and empirical results. 

As novel kinds of online information systems continue to proliferate, one sees dif- 
fusion processes not just forming part of the underpinnings of the system, but in many 
cases built directly into the design as well, in settings such as social search, media 
sharing, or community-based question-answering. As part of this process, extremely 
rich data are becoming available for studying diffusion processes in online environ- 
ments, at a large scale and with very fine time resolution. These developments are 
forming a feedback loop that will inevitably drive the formulation of richer theories, 
and enable us to pose more incisive questions about the ways in which information, 
influence, and behaviors of all kinds spread through human social networks. 


24.6 Notes and Further Reading 


The general topic of diffusion in social networks is discussed in the books by Coleman, 
Katz, and Menzel (1966), Rogers (1995), and Valente (1995), and the survey by 
Strang and Soule (1998). Granovetter (1978) and Schelling (1978) provide some of 
the early mathematical models for these processes. Early game-theoretic formulations 
of diffusion models were proposed by Blume (1993) and Ellison (1993), and they 
form part of the focus of a book by Young (1998). The specific model and results in 
Section 24.2 are from Morris (2000), and further game-theoretic models of diffusion 
have been explored by Jackson and Yariv (2005). There are also connections at a 
technical level between the models used in studying diffusion and some of the more 
graph-theoretic techniques that have been applied to evolutionary game theory; see for 
example the survey by Lieberman, Hauert, and Nowak (2005) and the paper by Kearns 
and Suri (2006). 

Models for diffusion are also closely related to work on the topic of contact processes 
and particle systems studied in the area of probability (Durrett, 1988; Liggett, 1985), 
as well as to the long history of work in mathematical epidemiology, which studies 
the dynamics of biological (as opposed to social) contagion (Bailey, 1975). There has 
been a recent line of work aimed at relating such probabilistic contagion models more 
closely to the underlying network structure; see for example the recent work by 
Pastor-Satorras and Vespignani (2000), Newman (2002), and Alon, Benjamini, and 
Stacey (2004). 

The problem of finding the most influential set of k individuals, as discussed in 
Section 24.4, was posed, together with the viral marketing motivation, by Domingos 
and Richardson (2001). The search for influential nodes in networks of blogs and news 
sources has been considered by Adar et al. (2004), Gruhl et al. (2004), and Kumar et 
al. (2004). 

The approximation result in Section 24.4 is due to Kempe, Kleinberg, and Tardos 
(2003, 2005); the general theorem on which it depends, that hill-climbing provides a 
good approximation for arbitrary monotone submodular functions, is due to Fisher, 
Nemhauser, and Wolsey (1978). The formulations of the models in Section 24.3 are 
also from Kempe et al. (2003), with closely related models proposed independently by 
Dodds and Watts (2004). 

Recommendation incentive programs, discussed at the end of Section 24.4, are 
studied empirically by Leskovec, Adamic, and Huberman (2006) and by Leskovec, 
Singh, and Kleinberg (2006); the development of theoretical models for such systems 
remains largely an open question. The study of diffusion processes on LiveJournal 
discussed in Section 24.5 is from Backstrom, Huttenlocher, Kleinberg, and Lan (2006), 
where a number of more subtle features of diffusion on LiveJournal are investigated as 
well. The identification of diminishing returns effects in recommendation incentives is 
from Leskovec, Adamic, and Huberman (2006). 


Exercises 


24.1 Prove Claim 24.1 in the proof of Theorem 24.1. 


24.2 The first model we considered in Section 24.2 was based on a networked coor- 
dination game in which, across each edge (v, w), nodes v and w each receive a 
payoff of q if they both choose behavior A, they each receive a payoff of 1 − q if 
they both choose behavior B, and they each receive a payoff of 0 if they choose 
opposite behaviors. Let us call this a coordination game with parameter q. 

It is natural to ask what happens if we consider a more general kind of co- 
ordination game on each edge. Suppose in particular that, on each edge (v, w), 
node v receives payoff u_XY if it chooses strategy X while w chooses strategy Y, for 
any choice of X ∈ {A, B} and Y ∈ {A, B}. Moreover, to preserve the coordination 
aspect, we assume that it is still better to play matching strategies: u_AA > u_BA and 
u_BB > u_AB. 

While this is indeed a more general kind of game, prove that the results on the 
contagion threshold remain the same. Specifically, prove that for any infinite graph 
G with finite node degrees, and for any choice of payoffs {u_AA, u_BA, u_AB, u_BB} 
satisfying u_AA > u_BA and u_BB > u_AB, there exists a real number q such that the 
following holds: A finite set S is contagious in G with respect to the coordination 
game defined by {u_AA, u_BA, u_AB, u_BB} if and only if it is contagious in G with 
respect to the coordination game with parameter q. 


24.3 (a) In Section 24.4, we considered the problem of finding a set S of k nodes that 
maximizes the expected number of activated nodes f(S). Show that for some 
class of instances of the General Threshold or Cascade Model, finding the 
optimal set S is NP-hard. 

(b) For some class of instances of the General Threshold or Cascade Model, show 
that in fact it is NP-hard to approximate the optimal value of f(S) to within a 
factor of n^(1−ε) for any ε > 0, where n is the number of nodes. 


CHAPTER 25 


Incentives and Information 
Security 


Ross Anderson, Tyler Moore, Shishir Nagaraja, 
and Andy Ozment 


Abstract 


Many interesting and important new applications of game theory have been discovered over the past 
7 years in the context of research into the economics of information security. Many systems fail not 
ultimately for technical reasons but because incentives are wrong. For example, the people who guard 
a system often are not the people who suffer the full costs of failure, and as a result they make less 
effort than would be socially optimal. Some aspects of information security are public goods, like 
clean air or water; externalities often decide which security products succeed in the marketplace; and 
some information risks are not insurable because they are correlated in ways that cause insurance 
markets to fail. 

Deeper applications of game-theoretic ideas can be found in the games of incomplete information 
that occur when critical information, such as about software quality or defender efforts, is hidden from 
some principals. An interesting application lies in the analysis of distributed system architectures; it 
took several years of experimentation for designers of peer-to-peer systems to understand incentive 
issues that we can now analyze reasonably well. Evolutionary game theory has recently allowed 
us to tie together a number of ideas from network analysis and elsewhere to explain why basing 
peer-to-peer systems on rings is a bad idea, and why revolutionaries use cells instead. The economics 
of distributed systems looks like being a very fruitful field of research. 


25.1 Introduction 


Over the last 7 years, people have realized that security failure is caused at least as 
often by misaligned incentives as by technical design mistakes. Systems are particularly 
prone to failure when the person guarding them is not the person who suffers when they 
fail. The tools and concepts of game theory and microeconomic theory are becoming 
just as important as the mathematics of cryptography to the security engineer. 

In this chapter, we present several live research challenges in the economics of 
information security, many of which bear on problems in various branches of game 
theory. We first consider misaligned incentives, and externalities: network insecurity is 
somewhat like air pollution or traffic congestion, in that people who connect insecure 
machines to the net do not bear the full consequences of their actions and so do not make 
a socially optimal investment in protection. Next we examine the role of asymmetric 
information and the capacity for hidden action: games where one principal has more 
knowledge of the game state than her opponent, or games where she can make moves 
that become known only with a certain probability. 

The difficulty in measuring information security risks presents another challenge: 
these risks cannot be better managed until they can be better measured. Auctions and 
markets can help in various ways to measure the security of software and thereby 
reduce the information asymmetry prevalent in the software industry. We also examine 
the problem of insuring against attacks. The local and global correlations exhibited by 
different attack types largely determine whether an insurance market in the associated 
risks is feasible. 

The structure of computer networks can also have a great impact on player incentives. 
One topical example is that the effort devoted to censorship resistance in peer-to-peer 
systems depends upon whether the application design empowers players to choose 
which files to share or randomly distributes them. This realization enables us to model 
solidarity in networks that may come under selective attack. 

An even more striking example is how network topology can exacerbate the impact of 
viruses or susceptibility to targeted attacks. The regular networks, or random networks, 
commonly used in modeling do not behave the same way as real-world networks, 
which are better approximated by scale-free models. Scale-free networks turn out to 
be more robust against random failure but more vulnerable to targeted attack. We 
finally present a model that uses ideas from evolutionary game theory to explore the 
interaction between attack and defense strategies, and we provide a framework for 
evaluating strategies in networks where topology matters. 


25.2 Misaligned Incentives 


One of the observations that drove initial interest in security economics came from 
banking. In the United States, banks are generally liable for the costs of card fraud; 
when a customer disputes a transaction, the bank must either show that she is trying to 
cheat them or refund her money. In the United Kingdom, the banks had a much easier 
ride: they could often get away with claiming that the ATM system was “secure,” so 
a customer who complained must be mistaken or lying. “Lucky bankers,” one might 
think; yet it turned out that UK banks spent more on security and suffered more fraud. 
How could this be? Banks appear to have been suffering from a moral-hazard effect: 
UK bank staff knew that customer complaints would not be taken so seriously, so they 
became lazy and careless. This situation led to an avalanche of fraud. 

Another observation came from the state of the antivirus software market around 
the year 2000. People were not spending as much money on protecting their computers 
from infection as would have been ideal. Why not? Well, at that time, a typical virus 
payload was a service-denial attack against the Web site of a company like Amazon 
or Microsoft. While a rational consumer might well spend $20 to prevent a virus from 
trashing his hard disk, he might not do so just to prevent an attack on Bill. 

Legal theorists have long known that liability should be assigned to the party that 
can best manage the risk. Yet everywhere we look, we see online risks that are poorly 
allocated. The result is privacy failures and even protracted regulatory tussles. For 
example, the United States has seen widespread debate about medical privacy over 
the last 10 years: from the passage of the Health Insurance Portability and Account- 
ability Act, through the initial regulations made under the Act by President Clinton, 
the later regulations made by President Bush, and the recent claims that the law fails 
to protect health privacy while providing a gold-mine for security vendors. The root 
problem is that medical information systems are purchased by hospital directors and 
insurance companies, whose interests in account management, cost control and re- 
search are not well aligned with the patients’ interests in the privacy of their health 
records. 


25.2.1 Applications of Game Theory 


Game theory can provide a means of better understanding the outcome of security 
decisions made by self-interested individuals. Information security levels often depend 
on the efforts of many principals, leading to suboptimal security investment whenever 
decisions are uncoordinated. The level of security investment generally depends on the 
investor’s own costs and benefits, the investment decisions of others, and the manner 
in which individual investment translates to outcomes. System reliability can depend 
on the sum of individual efforts, the minimum effort invested, or the maximum effort 
invested. Programming, for example, might be down to the weakest link (the most 
careless programmer introducing a fatal vulnerability) while software validation and 
vulnerability testing might depend on the sum of everyone’s efforts. There can also be 
cases where the security depends on the best effort — the effort of a star cryptanalyst. 
These different models have interesting effects on whether an appropriate level of 
defense can be provided and what policy measures are advisable. 
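
The three ways in which individual efforts can combine may be sketched in a few lines of Python. This is only an illustration of the aggregation rules named above: the function names and the effort values are hypothetical, and no equilibrium behavior is modeled.

```python
# Illustrative sketch: three ways individual efforts can combine into
# system reliability. All names and numbers here are hypothetical.

def total_effort(efforts):
    """Reliability driven by the sum of everyone's efforts."""
    return sum(efforts)

def weakest_link(efforts):
    """Reliability driven by the minimum effort invested."""
    return min(efforts)

def best_shot(efforts):
    """Reliability driven by the maximum effort invested."""
    return max(efforts)

team = [0.9, 0.7, 0.2]        # hypothetical effort levels of three agents
larger_team = team + [0.1]    # the same team plus one careless agent

for name, rule in [("total effort", total_effort),
                   ("weakest link", weakest_link),
                   ("best shot", best_shot)]:
    print(name, rule(team), rule(larger_team))
```

With fixed efforts, adding a careless agent raises the total, lowers the minimum, and leaves the maximum unchanged, which is why the same hiring decision can help under one rule and hurt under another.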

A simple model by Varian provides interesting results when players choose their 
effort levels independently. For the total-effort case, system reliability depends on the 
agent with the highest benefit-cost ratio, and all other agents free ride. In the weakest- 
link case, the agent with the lowest benefit-cost ratio dominates, since any additional 
effort is wasted. Systems become increasingly reliable in the total-effort case as more 
agents are added, but they become increasingly unreliable in the weakest-link case. 
What are the implications? One is that software companies should hire more software 
testers but fewer (more competent) programmers. 

Work such as this has inspired other researchers to consider interdependent risk. A 
recent influential model by Kunreuther and Heal notes that the security of a group often 
rests on each of its members: an individual taking protective measures creates positive 
externalities for others that in turn may discourage their own investment. This result has 
implications far beyond information security. The decision by one apartment owner to 
install a sprinkler system affects his neighbors’ decisions to install their own systems; 
airlines may decide not to screen luggage transferred from other carriers who are 
believed to be careful with security; and people thinking of vaccinating their children 
against a contagious disease may choose to free-ride off the herd immunity instead. 
In each case, several widely varying Nash equilibrium outcomes are possible, from 
complete adoption to total refusal, depending on the levels of coordination between 
independent actors. 


25.2.2 Network Effects and Deployment 


Game theory is also used to ascertain how network effects impact the level of security 
investments. In particular, many security technologies face bootstrapping problems. 
The benefit that these technologies provide to players is dependent upon the number 
of players that adopt the technology. A bootstrapping problem exists because the cost 
of the technology is greater than the benefit until a minimum number of players adopt. 
As a result, each player waits for other players to go first, and the technology is never 
deployed. 
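
A minimal sketch of the bootstrapping problem follows. It assumes, purely for illustration, that each adopter's benefit grows linearly with the number of adopters; the linear form, the function names, and the numbers are assumptions and are not taken from the literature cited here.

```python
# Illustrative sketch of a bootstrapping threshold. The linear benefit
# function and all parameter values are assumptions made for this example.

def net_benefit(adopters, benefit_per_adopter, cost):
    """Payoff to one adopter when `adopters` players (itself included) have adopted."""
    return benefit_per_adopter * adopters - cost

def critical_mass(benefit_per_adopter, cost, population):
    """Smallest number of adopters at which adopting pays, or None if it never does."""
    for n in range(1, population + 1):
        if net_benefit(n, benefit_per_adopter, cost) >= 0:
            return n
    return None

# A lone first mover loses money, so nobody wants to go first ...
print(net_benefit(1, benefit_per_adopter=1.0, cost=10.0))
# ... even though adoption would pay once enough players have joined.
print(critical_mass(benefit_per_adopter=1.0, cost=10.0, population=50))
```

Under these assumptions both "nobody adopts" and "everybody adopts" are stable outcomes, which is the sense in which adoption is a coordination problem.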

Following the seminal work of Katz and Shapiro, a number of economists have 
examined the problem of deploying a technology that exhibits network effects. Most 
of this literature concludes that adoption is a coordination problem. The challenge 
is to coordinate the different players and to enforce their cooperation. However, the 
assumptions used in these models do not apply to many security technologies. For 
example, security technologies that are software-based can often be deployed rapidly, 
while the economics literature is concerned with coordinating players who must make 
their decisions far in advance of a slow-moving deployment. Furthermore, security 
technologies may not provide special benefits to early adopters. 

This area is especially topical at the moment. A number of core Internet protocols 
are considered insecure, such as DNS and routing. More secure protocols exist; the 
challenge is to bootstrap their adoption. Two examples of security protocols that have 
already been widely deployed are SSH and IPsec. Both of these protocols overcame the 
bootstrapping problem because they could provide significant intraorganizational ben- 
efits (X session support and VPNs). Adoption was thus driven by organizational needs 
rather than the benefit that players derived from the global network. The deployment 
of fax machines also occurred through this mechanism: companies initially bought fax 
machines to connect their own offices. Limiting the players in a game to the members 
of some kind of club can also have interesting effects on other aspects of security, as 
we see below. 


25.3 Informational Asymmetries 


We now consider two types of informational asymmetries relevant to information 
security: hidden action, where the difficulty of observing others’ actions facilitates 
certain attacks; and hidden information, where the difficulty of measuring software 
security has caused vendors to underinvest in quality. 


25.3.1 Hidden-Action Attacks 


In the theory of asymmetric information, a hidden-action problem arises whenever two 
parties wish to transact, but one party can take unobservable actions that impact the 
transaction. The classic example comes from insurance, where the insured party may 
choose to behave recklessly (which in turn increases the likelihood of a claim) because 
the insurance company cannot observe her behavior. Crossing to the security domain, 
this idea generalizes to a class of hidden-action attacks, which are attractive precisely 
because observation (and therefore punishment) is unlikely. 

Computer networks are naturally susceptible to hidden-action attacks. Routers can 
quietly drop selected packets or falsify responses to routing requests; nodes can redirect 
network traffic to eavesdrop on conversations; and players in file-sharing systems can 
hide whether they have chosen to share with others, so some may choose to “free-ride” 
rather than to help replenish the system. The common element in these examples is 
that nodes can hide malicious or antisocial behavior from other network elements. 

Hidden-action attacks may occur whenever the net utility gain from deviation is 
greater than the expected penalty enforced when observation is unlikely and less than 
the expected penalty enforced when observation is likely. (If the expected gain from 
an attack does not exceed the expected penalty even when actions are likely to remain 
hidden, then no attack should occur. If the expected gain in attacking exceeds the 
expected penalty even when observed, then the attack should be launched regardless 
of whether or not observation is likely.) 
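
The condition above can be written out as a small classifier. This is a sketch under simplifying assumptions: the expected penalty is taken to be the probability of observation times a fixed penalty, and the function name, parameter names, and labels are hypothetical.

```python
# Illustrative sketch of the hidden-action condition. Assumes the expected
# penalty equals (probability of being observed) * (penalty if observed).

def classify_attack(gain, penalty, p_observed_low, p_observed_high):
    """Compare the gain from deviating with the expected penalty in the
    two regimes: observation unlikely and observation likely."""
    expected_penalty_hidden = p_observed_low * penalty
    expected_penalty_visible = p_observed_high * penalty
    if gain <= expected_penalty_hidden:
        return "no attack"                      # not worth it even when hidden
    if gain > expected_penalty_visible:
        return "attack regardless of observation"
    return "hidden-action attack"               # worth it only when unobserved

print(classify_attack(gain=5, penalty=100, p_observed_low=0.01, p_observed_high=0.9))
```

In this reading, raising the probability of observation in the "hidden" regime is what moves a node from the third case to the first.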

In the economics literature, hidden-action problems are dealt with by structuring 
contracts to induce proper behavior. For example, auto insurers use deductibles to 
mitigate driver recklessness. By charging customers to file a claim, insurers create 
an incentive for taking reasonable steps to avoid negative outcomes. The need for 
observation is eliminated, though not without cost: everyone has to pay, even when the 
insured did not act recklessly. Mechanism design, as discussed throughout the rest of 
this book, attempts to create systems that align all of the agents’ incentives so that the 
agents’ best interest is to operate as intended. A complementary approach is to alter 
the topology and structure of the interactions to increase observability. 

One telling example comes from peer-to-peer systems. These exploit network exter- 
nalities to the fullest by having large member populations with a flat topology: joining 
one creates the potential for collaboration with every other peer in the system. High 
turnover is also expected; nodes may join and leave the system rapidly. These proper- 
ties lower the prospects for repeated interactions, which in turn makes cheating more 
likely. Inexpensive or even costless identities exacerbate the problem of unrepeated 
interactions while also making penalties harder to implement. In a network with these 
properties, nodes are predisposed to hidden action. 

One solution is to change the network topology. In most peer-to-peer systems, any 
node can transact with any other on joining the network. While this flat topology 
maximizes transaction possibilities, it makes repeated transactions unlikely and ob- 
servation difficult. An alternative is to adopt a network topology based on clubs of 
nodes with common interests. Here, nodes first transact with other members of their 
club to establish legitimacy. Once trust has been established inside the cluster, outside 
transactions can happen through established channels between groups. Such a topology 
facilitates self-enforcement by establishing a credible threat of observation to forestall 
hidden action, and by creating long-lived principals (clubs) against whom sanctions 
hurt. 

Social networks can also be used to create better topologies. When honest players can 
select their friends as neighbors rather than having their neighbors randomly assigned, 
they minimize the informational asymmetry present during neighbor interactions. This 
can raise the cost of entry for an attacker as well as align the incentives between normal 
players. However, social networks can also lead to inefficient outcomes as players may 
not be exposed to diverse information and isolated players may be marginalized. 


25.3.2 Hidden Information: Measuring Software Security 


Another information asymmetry in information security is caused by our inability to 
effectively measure the security of software. Most commercial software contains design 
and implementation flaws that could easily have been prevented. Although vendors are 
capable of creating more secure software, the economics of the software industry 
provide them with little incentive to do so. In many markets, “ship it Tuesday and get it 
right by version 3” is perfectly rational behavior. Consumers generally reward vendors 
for adding features, for being first to market, or for being dominant in an existing 
market. These motivations clash with the goal of writing more secure software, which 
requires time-consuming testing and a focus on simplicity. Nonetheless, the problems 
of software insecurity, viruses, and worms are frequently in the headlines; why does 
the potential damage to vendor reputations not motivate them to invest in more secure 
software? 

Vendors’ lack of motivation is readily explained: the software market is a “market 
for lemons.” In a Nobel prize-winning work, economist George Akerlof employed the 
used car market as a metaphor for a market with asymmetric information. His paper 
imagines a town in which 50 good used cars (worth $2,000) are for sale, along with 
50 “lemons” (worth $1,000 each). The sellers know the difference but the buyers do 
not. What is the market-clearing price? One might initially think $1,500, but at that 
price no-one with a good car will offer it for sale; so the market price quickly ends up 
near $1,000. Because buyers are unwilling to pay a premium for quality they cannot 
measure, only low quality used vehicles are available for sale. 
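
The unraveling in this example can be traced step by step. The sketch below assumes that buyers offer the average value of the cars currently on sale and that a seller withdraws when the price is below what the car is worth; these behavioral rules and the function name are simplifying assumptions made for illustration.

```python
# Illustrative sketch of the lemons example. Assumes buyers pay the average
# value of the cars on offer and sellers withdraw cars worth more than that.

def clearing_price(car_values):
    """Repeat until stable: price = average value on offer; cars worth more leave."""
    offered = list(car_values)
    while True:
        price = sum(offered) / len(offered)
        remaining = [value for value in offered if value <= price]
        if len(remaining) == len(offered):
            return price
        offered = remaining

cars = [2000] * 50 + [1000] * 50   # 50 good cars and 50 lemons, as in the text
print(clearing_price(cars))
```

The first round prices the pool at $1,500, the good cars leave, and the second round prices the remaining lemons at $1,000.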

The software market suffers from the same asymmetry of information. Vendors may 
have some intuition about the security of their products, but buyers have no reason to 
trust them. In some cases, even the vendor might not have a truly accurate picture of 
its software’s security. As a result, buyers have no reason to pay the premium required 
to obtain more secure software, and vendors are disinclined to invest in protection. 

Three broad research approaches have attempted to provide useful measures of 
the security of software: statistical, market-based, and insurance-based. The first 
approach relies on the application of reliability growth models to vulnerabilities and is 
not discussed here. The latter two approaches are discussed below. 


25.3.3 Market-Based Approaches 


One possible way to measure the security of software is to rely on a market: let buyers 
and sellers establish the actual cost of finding a vulnerability in software or merely 
estimate the security of software according to their own knowledge. For example, 
banking standards for PIN-entry terminals specify a minimum cost of various kinds of 
technical compromise. 

In the software business, open markets for reports of previously undiscovered vul- 
nerabilities could provide a security metric. The bid, ask, and most recent sale prices 
in such a market approximate the labor cost to find a vulnerability. These prices can 
establish which of two products the market deems to have vulnerabilities that are less 
expensive to find. Alternatively, a vulnerability market of this type could be designed 
as an auction. 

Several organizations are now actively purchasing vulnerabilities, so an open market 
or auction actually exists. Unfortunately, these organizations are not publishing their 
prices. Their business model is to provide the vulnerability information simultaneously 
to their customers and to the vendor of the affected product (in contrast to the previous 
practice of waiting until after a patch is released and then making the existence of the 
vulnerability public). The business models of these organizations are thus not socially 
optimal: they always have an incentive to leak vulnerability information without proper 
safeguards. 

A market for software security derivatives could also enable security professionals 
to reach a price consensus on the level of security for a product. Contracts could be 
issued in pairs: the first pays a fixed value if no vulnerability is found in a program 
by a specific date, and the second pays the same value if vulnerabilities have been 
found in that program by that date. If these contracts can be issued as desired and 
traded via some market, then their trading price indicates the consensus opinion on the 
security of the program. Software security derivatives could thus conceivably be used 
to hedge risks by software vendors, players, software company investors, and insurance 
companies. 
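
One way to read the trading price of such a pair is as a market-implied probability. The sketch below assumes that interest, fees, and risk premia can be ignored, so that the price of the contract paying out on a vulnerability, divided by the payout, approximates the consensus probability that one is found by the deadline. The function names and the tolerance are hypothetical.

```python
# Illustrative sketch: reading a consensus probability off paired contracts.
# Ignores interest, fees, and risk premia (a simplifying assumption).

def implied_vulnerability_probability(price_vuln_contract, payout):
    """Price of the 'vulnerability found' contract as a fraction of its payout."""
    if payout <= 0:
        raise ValueError("payout must be positive")
    return price_vuln_contract / payout

def pair_is_consistent(price_no_vuln, price_vuln, payout, tolerance=0.01):
    """Exactly one contract of the pair pays out, so together they should cost about the payout."""
    return abs(price_no_vuln + price_vuln - payout) <= tolerance * payout

print(implied_vulnerability_probability(30.0, 100.0))
print(pair_is_consistent(70.0, 30.0, 100.0))
```

Comparing this figure across two products with the same deadline would give the kind of relative security ranking the text describes.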


25.3.4 Insurance-Based Approaches 


Another approach to measuring the security of software is to rely on insurers. The ar- 
gument for insurance is that cyber-insurance underwriters assign premiums based upon 
a firm’s IT infrastructure and the processes by which it is managed. This assessment 
results in both detailed best practices and, over the long run, a pool of data by which the 
insurance company can accurately assign a monetary value to the risks associated with 
certain practices or software. At the moment, however, the cyber-insurance market is 
both underdeveloped and underutilized. Why should this be? 

One reason is the problem of interdependent risk, which takes at least two forms. 
Firms are “physically interdependent” because their IT infrastructure is connected via 
the Internet to other entities — which implies that the work a firm performs to secure 
itself may be undermined by failures at other firms. Firms are “logically interdependent” 
because cyber attacks often exploit a vulnerability in a system used by many firms. For 
example, viruses or worms may have a global impact upon a specific software platform. 
This interdependence makes certain cyber-risks unattractive to insurers — particularly 
those where the risk is globally rather than locally correlated, such as worm and virus 
attacks, and systemic risks such as Y2K. We note in passing that many writers have 
called for cyber-risks to be transferred to the responsible software vendors; if this were 
the law, it is unlikely that Microsoft would be able to buy insurance. So far, vendors 
have succeeded in dumping almost all risk; but this outcome is also far from being 
socially optimal. 

Because a firm’s security depends in part on the efforts of others, firms underinvest in 
both security technology and cyber insurance. At the same time, insurance companies 
must charge a higher premium because the risks against which they are insuring are 
highly correlated: this higher premium may prevent the vast majority of firms from 
adequately insuring themselves. As a result, cyber insurance markets may lack the 
volume and liquidity to become economically efficient. 
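
The effect of correlation on an insurer's pool can be illustrated with a standard identity. Assuming n policies whose losses have equal variance and equal pairwise correlation (both simplifying assumptions), the variance of the average loss per policy is variance * (1/n + (1 - 1/n) * correlation). The function name and numbers below are hypothetical.

```python
# Illustrative sketch: why correlated risks resist pooling. Assumes every
# policy has the same loss variance and the same pairwise correlation.

def per_policy_loss_variance(n_policies, variance, correlation):
    """Variance of the average loss across a pool of n equally correlated policies."""
    return variance * (1.0 / n_policies + (1.0 - 1.0 / n_policies) * correlation)

for rho in (0.0, 0.5):
    # With rho = 0 the variance shrinks as the pool grows; with rho > 0 it
    # cannot fall below rho * variance however many policies are written.
    print(rho, [round(per_policy_loss_variance(n, 1.0, rho), 4) for n in (10, 100, 10000)])
```

Independent risks average out as the pool grows, whereas globally correlated risks such as worms leave a floor of uncertainty that the insurer must cover through higher premiums.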


25.4 The Economics of Censorship Resistance 


We have seen that misaligned incentives and information asymmetries are important 
problems in information security that are amenable to a game theoretic analysis. An- 
other such problem is censorship resistance. 

Early peer-to-peer systems were oriented toward censorship resistance rather than 
music file sharing. They put all content into one pot, with the effect that quite different 
groups would end up protecting each other’s free speech, be they Falun Gong members, 
critics of Scientology, or aficionados of sado-masochistic imagery that is legal 
in California but banned in Tennessee. The question then arises whether such groups 
might not be better off with their own peer-to-peer systems. Perhaps they would fight 
harder to defend their own type of dissident, rather than people involved in struggles 
in which they had no interest and where they might even be disposed to side with the 
censor. In the file-sharing context, it might make sense to have a constellation of fan 
clubs, rather than one huge system — as musicians take widely different views of music 
sharing, remixing and other activities on the fringes of classical copyright practice. 

Such questions are also of topical interest to social theorists and policy people, who 
wonder whether the growing diversity of modern societies is undermining the social 
solidarity on which modern welfare states are founded. A related question in guerrilla 
warfare is when combatants should aggregate or disperse. 

Some peer-to-peer systems provide a “single pot,” with widely and randomly 
distributed functionality; examples are Eternity, Freenet, Chord, Pastry, and OceanStore. 
Other systems, like the popular Gnutella and Kazaa, allow peer nodes to serve content 
they have downloaded for their personal use, without burdening them with random files. 
The comparison between these architectures originally focused on purely technical 
aspects: the cost of search, retrieval, communications, and storage. However, it turns 
out that incentives matter here too. 


25.4.1 Red-Blue Utility Model 


Danezis and Anderson introduced the Red-Blue model to analyze the trade-off between 
diversity and solidarity in distributed systems. We consider a network of N peer nodes. 
Each node n_i has a preference among two types of resource, say red and blue; one node 
might prefer to serve 20% red and 80% blue, while another prefers 80% red and 20% 
blue, and the network overall contains 50% red and 50% blue. A censor who attacks 
the network tries to impose his own preference, perhaps 80% red and 20% blue. This 
action may meet the approval of some nodes, but usually not most of them. 

We assign to each node n_i a preference for red r_i ∈ [0, 1] and a preference for blue 
b_i = 1 − r_i (note that r_i + b_i = 1). While each node likes having and serving resources, 
it prefers to have or serve a balance of resources according to its own preferences r_i and 
b_i. So we define the utility function of a node holding T resources, of which R are 
red resources and B are blue resources (with T = R + B), as 


U_i(R, B) = T \left( \frac{R}{T} - r_i - 1 \right) \left( \frac{B}{T} - b_i - 1 \right)    (25.1) 
This is a quadratic function with its maximum at R = r_i T, scaled by the overall number 
of resources T that the node n_i holds. This utility function increases as the total number 
of resources does, but is also maximal when the balance between red and blue resources 
matches the preferences of the node (R = r_i T and B = b_i T). When nodes choose the 
file distribution to serve, their utility is naturally maximized. 

Distributed hash tables and architectures such as Eternity, by contrast, scatter the 
red and blue resources randomly across all nodes n_i. If the system has a total of R red 
resources and B blue resources, we can define a systemwide distribution of resources 
(r_s, b_s) so that each node in the system holds on average: 

r_s = \frac{R}{R + B}, \qquad b_s = \frac{B}{R + B}.    (25.2) 

Each node n_i has on average a utility equal to U_i(r_s T, b_s T). The utility each node 
attains in the random case is always less than or equal to the utility a node has under 
the discretionary model: 

U_i(r_i T, b_i T) \geq U_i(r_s T, b_s T).    (25.3) 

U_i(r_i T, b_i T) = U_i(r_s T, b_s T) when r_s = r_i and b_s = b_i; in other words, when the 
system’s distribution of resources aligns with a particular node’s preferences. However, 
this cannot hold true for all nodes unless they share the same preferences. Moreover, 
it is in every node’s self-interest to try to tip the balance of R and B toward its own 
preferences. With a utility function slightly more biased toward serving, the network 
could be flooded with red or blue files, depending on the dominant preference. 
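
The comparison between the discretionary and the random case can be checked numerically. The sketch below is illustrative only: it assumes one concrete utility, U_i(R, B) = T (1 − (R/T − r_i)^2), chosen because it has the properties described above (quadratic, maximal at R = r_i T, growing with T). The function names are ours and are not taken from Danezis and Anderson.

```python
def utility(r_i, R, B):
    """Assumed Red-Blue utility of a node that prefers a red share r_i
    and holds R red and B blue resources (hypothetical concrete form)."""
    T = R + B
    if T == 0:
        return 0.0
    return T * (1.0 - (R / T - r_i) ** 2)


def discretionary_utility(r_i, T):
    """The node chooses its own mix, so R = r_i * T and B = (1 - r_i) * T."""
    return utility(r_i, r_i * T, (1.0 - r_i) * T)


def random_utility(r_i, r_s, T):
    """The node is handed the systemwide mix (r_s, 1 - r_s)."""
    return utility(r_i, r_s * T, (1.0 - r_s) * T)


T = 100      # resources held by each node
r_s = 0.5    # systemwide red share
for r_i in (0.2, 0.5, 0.8):
    print(r_i, discretionary_utility(r_i, T), random_utility(r_i, r_s, T))
```

Under this assumed utility, only the node whose preference coincides with the systemwide mix (r_i = 0.5 here) does as well under random placement as under its own choice.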


25.4.2 Comparing Censorship Resistance 


We model censorship as an external entity’s attempt to impose a particular distribution 
of files r_c, b_c on a set of nodes. The censor’s effect is not fixed; rather, it depends on 
the amount of resistance the affected nodes offer. 

Assume a node that is not receiving attention from the censor can store up to T 
resources. A node under censorship can choose to store fewer resources (T − t) and 
invest an effort level t to resist censorship. We define the probability that a node 
successfully fights censorship (and reestablishes its previous distribution of resources) as 
P(t). With probability 1 − P(t), the censor prevails and imposes the distribution r_c, b_c. 

We first consider the discretionary case, in which nodes select the content they 
serve. Knowing the nodes’ preferences r_i, b_i, the censor’s distribution r_c, b_c, the total 
resource bound T, and the probability P(t) that a node defeats the censor, we can calculate 
the optimal amount of resources a node invests in resisting censorship. The expected 
utility of a node under censorship is the probability of success, times the utility in that 
case, plus the probability of failure times the utility in that case: 

U = P(t) U_i(r_i(T - t), b_i(T - t)) + (1 - P(t)) U_i(r_c(T - t), b_c(T - t)).    (25.4) 

Our utility functions U_i are unimodal and smooth, so if the functions P(t) are 
sufficiently well-behaved, there is a single optimal investment in resistance t in [0, T], 
which we find by setting dU/dt = 0. 




We begin with the simplest example, namely where the probability P(t) of resisting 
censorship is linear in the defensive effort t. Assume that if a node invests all its 
resources in fighting, it definitely prevails but has nothing left to serve any files. At 
the other extreme, if it spends nothing on lawyers (or whatever the relevant mode of 
combat) then the censor prevails for sure. Therefore we define P(t) as 

P(t) = \frac{t}{T}.    (25.5) 


By maximizing (25.4) with P(t) defined as in (25.5), we find the optimal defense 
budget t_d: 

t_d = \frac{T}{2} \cdot \frac{2U_i(r_c, b_c) - U_i(r_i, b_i)}{U_i(r_c, b_c) - U_i(r_i, b_i)}.    (25.6) 

The node diverts t_d resources from serving files to fighting censorship. We also 
assume, for now, that the cost of the attack for the censor is equal to the node’s defense 
budget t_d. 
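
The step from (25.4) and (25.5) to (25.6) can be spelled out as follows. This is a sketch under one assumption that we make explicit: utility scales linearly with the number of resources served, so that U_i(r x, b x) = x U_i(r, b). The shorthand U_d = U_i(r_i, b_i) and U_c = U_i(r_c, b_c) is ours.

```latex
% Assumption: U_i(r x, b x) = x U_i(r, b); shorthand U_d = U_i(r_i, b_i), U_c = U_i(r_c, b_c).
\begin{align*}
U(t) &= \frac{t}{T}\,(T - t)\,U_d + \Bigl(1 - \frac{t}{T}\Bigr)(T - t)\,U_c
      = (T - t)\Bigl[\,U_c + \frac{t}{T}\,(U_d - U_c)\Bigr], \\
\frac{dU}{dt} &= -U_c + \Bigl(1 - \frac{2t}{T}\Bigr)(U_d - U_c) = 0, \\
t_d &= \frac{T}{2}\cdot\frac{U_d - 2U_c}{U_d - U_c}
     = \frac{T}{2}\cdot\frac{2U_c - U_d}{U_c - U_d}.
\end{align*}
```

The second derivative is −2(U_d − U_c)/T, which is negative whenever the node prefers its own distribution to the censor’s, so the stationary point is a maximum in that case.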

We now turn to the case of Eternity or DHTs where resources are scattered randomly 
around the network, so that each node is expected to hold a mixture of files r_s, b_s. As 
in the previous example, the utility of a node under censorship depends on its defense 
budget t, the censor’s choice of r_c, b_c, and the system’s distribution of files r_s, b_s: 

U = P(t) U_i(r_s(T - t), b_s(T - t)) + (1 - P(t)) U_i(r_c(T - t), b_c(T - t)).    (25.7) 

A similar approach is followed as above to derive the optimal defense budget t_s for 
each node: 

t_s = \frac{T}{2} \cdot \frac{2U_i(r_c, b_c) - U_i(r_s, b_s)}{U_i(r_c, b_c) - U_i(r_s, b_s)}.    (25.8) 


However, not all nodes are motivated to resist the censor! Some may find that 
U_i(r_s T, b_s T) < U_i(r_c T, b_c T), which means that their utility under censorship 
increases. This is not an improbable situation: in a network where half the resources 
are red and half are blue (r_s = 0.5, b_s = 0.5) a censor that shifts the balance to r_c = 0 
benefits the blue-loving nodes, and if they are free to set their own defense budgets 
then they select t = 0. 

Who fights censorship harder? The aggregate defense budget, and thus the cost of 
censorship, is greater in the discretionary model than in the random one, except in the 
case in which all nodes have the same preferences (in which case equality holds). 

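For the optimal defense budget t to be positive in the interval [0, T], 
the following condition must be true: 

0 < \frac{T}{2} \cdot \frac{2U_i(r_c, b_c) - U_i(r_s, b_s)}{U_i(r_c, b_c) - U_i(r_s, b_s)}.    (25.9) 

The denominator is negative for any node that loses utility under censorship, so the 
numerator must be negative as well. In other words, 

2U_i(r_c, b_c) < U_i(r_s, b_s).    (25.10) 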




When this is not true, a node maximizes its utility by not fighting at all and choosing 
t = 0. Given these observations, it follows that, for any set S of nodes under attack, 

\forall i \in S:\ t_d \geq t_s \implies \sum_{i \in S} t_d \geq \sum_{i \in S} t_s.    (25.11) 

Whatever the attacker’s strategy, it is at least as costly to attack a network built on 
the discretionary model as one built on the random model. Equality 
holds when, for each node, t_d = t_s, which in turn means that r_i = r_s. This is the case 
of homogeneous preferences. In all other cases, the cost to censor a set of nodes is 
maximized when resources are distributed according to their preferences rather than 
randomly. 
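
The aggregate comparison can be illustrated with a small calculation. The sketch below assumes a per-resource utility 1 − (r − r_i)^2 (our choice, consistent with a quadratic utility peaking at the node’s own preference), takes the systemwide mix r_s to be the average preference, and applies the budget formulas (25.6) and (25.8), with t = 0 whenever condition (25.10) fails. All names are illustrative.

```python
def unit_utility(r_i, r):
    """Assumed per-resource utility of serving a red share r
    for a node whose preferred red share is r_i."""
    return 1.0 - (r - r_i) ** 2


def defense_budget(u_status_quo, u_censored, T):
    """Optimal budget from (25.6)/(25.8); zero when 2*U_c >= U_status_quo,
    i.e. when the node gains nothing by fighting."""
    if 2.0 * u_censored >= u_status_quo:
        return 0.0
    return 0.5 * T * (2.0 * u_censored - u_status_quo) / (u_censored - u_status_quo)


def aggregate_budgets(preferences, r_c, T):
    """Total defense budget under the discretionary and the random model."""
    r_s = sum(preferences) / len(preferences)  # assumed systemwide mix
    total_d = total_s = 0.0
    for r_i in preferences:
        u_c = unit_utility(r_i, r_c)
        total_d += defense_budget(unit_utility(r_i, r_i), u_c, T)
        total_s += defense_budget(unit_utility(r_i, r_s), u_c, T)
    return total_d, total_s


# A censor imposing all-red content (r_c = 1) on mostly blue-leaning nodes.
print(aggregate_budgets([0.0, 0.1, 0.2, 0.9], r_c=1.0, T=100.0))
# Homogeneous preferences: both models yield the same budget.
print(aggregate_budgets([0.1, 0.1, 0.1], r_c=1.0, T=100.0))
```

With heterogeneous preferences the discretionary total exceeds the random total; the node with r_i = 0.9 fights in neither case because the censor’s mix is close to its own.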


25.5 Complex Networks and Topology 


The final area of information security that we discuss is the topology of complex 
networks. Computer networks from the Internet to decentralized peer-to-peer networks 
are systems of great complexity that emerge from ad hoc interactions of many entities on 
the basis of simple ground rules that are minimally restrictive. The emergent complexity, 
coupled with heterogeneity on every relevant scale, is similar to networks found “in 
the wild” — from the social networks made up from interactions between people to 
metabolic pathways in living organisms. Recently a discipline of network analysis has 
emerged at the boundary between sociology and condensed-matter physics. It draws on 
ideas from other disciplines such as graph theory, which provides tools and concepts 
for modeling and investigating such networks. Our interest here is the interaction of 
network science with information security; as we shall see, we can build an interesting 
bridge to evolutionary game theory. 

Network topology can strongly influence conflict dynamics. Often an attacker tries 
to disconnect a network or increase its diameter by destroying nodes or edges, while 
the defender counters using various resilience mechanisms. Examples include a music 
industry body attempting to close down a peer-to-peer file-sharing network; a police 
organization trying to decapitate a terrorist organization; and a totalitarian government 
conducting surveillance on political activists. Police forces have been curious for some 
years about whether network science might be of practical use in covert conflicts — 
whether to insurgents or to counterinsurgency forces. 

Different topologies have different robustness properties with respect to various 
attacks. Albert, Jeong, and Barabási famously showed that certain real-world networks 
with scale-free degree distributions are more robust to random attacks than to targeted 
attacks. This is because scale-free networks, like many real-world networks, get 
much of their connectivity from a minority of nodes that have a high vertex order. This 
makes them highly robust against random upsets; but remove the ‘kingpin’ 
nodes, and connectivity collapses. 
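
The contrast between random failure and targeted attack can be explored with a few lines of code. The sketch below is not the experiment of Albert et al.; it grows a graph by preferential attachment (an assumed stand-in for a scale-free network), removes the same number of nodes either at random or in order of degree, and reports the size of the largest connected component. Graph size, budget, and seed are arbitrary.

```python
import random
from collections import deque


def largest_component(adj):
    """Size of the largest connected component of an undirected graph."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def remove_nodes(adj, victims):
    """Return a copy of the graph without the victims and their edges."""
    victims = set(victims)
    return {v: nbrs - victims for v, nbrs in adj.items() if v not in victims}


def preferential_attachment(n, m, rng):
    """Each new node links to m distinct existing nodes, chosen with
    probability proportional to their current degree."""
    adj = {v: set() for v in range(n)}
    endpoints = []                      # every node appears once per incident edge
    for v in range(m + 1):              # seed: a clique on m + 1 nodes
        for w in range(v):
            adj[v].add(w)
            adj[w].add(v)
            endpoints += [v, w]
    for v in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for w in targets:
            adj[v].add(w)
            adj[w].add(v)
            endpoints += [v, w]
    return adj


rng = random.Random(1)
graph = preferential_attachment(400, 2, rng)
budget = 40
random_victims = rng.sample(sorted(graph), budget)
hub_victims = sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:budget]
print("random failure :", largest_component(remove_nodes(graph, random_victims)))
print("targeted attack:", largest_component(remove_nodes(graph, hub_victims)))
```

The extreme case is a star: removing a leaf costs one node, while removing the hub leaves only isolated vertices.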

This is the static case: for example, when a police force becomes aware of a criminal 
or terrorist network, and sets out to disrupt it by finding and arresting its key personnel. 
The result of Albert et al. models this well. But what about the dynamic case, where 
at each round the attacker can remove a certain number of nodes, but the defenders can 
recruit other nodes to replace them? How do attack and defense interact: what is the 
interplay of tactics and strategy? 

We built a simulation in which a network game is played over a number of rounds. 
Each round consists of attack followed by node replenishment and adaptation. The 
attacker can remove a proportion of nodes; his choice of nodes is his strategy. The 
defenders’ strategy lies in the adaptation phase: the way they rewire their network after 
each round of attack and replenishment. This rewiring must be done using only local 
knowledge. 
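
The structure of one round can be sketched as follows. This is our own minimal rendering, not the simulator used for the results reported here: the attack removes the highest-degree nodes, replenishment adds the same number of fresh nodes with a couple of random links each, and adaptation is a pluggable function that defaults to doing nothing. For brevity the adaptation hook receives the whole graph, whereas the defenders described above may use only local knowledge.

```python
import random


def vertex_order_attack(adj, budget):
    """Attacker strategy: choose the `budget` nodes of highest degree."""
    return sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:budget]


def remove_nodes(adj, victims):
    """Return a copy of the graph without the victims and their edges."""
    victims = set(victims)
    return {v: nbrs - victims for v, nbrs in adj.items() if v not in victims}


def replenish(adj, count, rng, links=2):
    """Add `count` fresh nodes, each linked to up to `links` random existing nodes."""
    next_id = max(adj, default=-1) + 1
    for new in range(next_id, next_id + count):
        peers = rng.sample(sorted(adj), min(links, len(adj)))
        adj[new] = set(peers)
        for w in peers:
            adj[w].add(new)


def play_round(adj, budget, rng, adapt=lambda g: g):
    """One round: attack, then replenishment, then the defenders' adaptation."""
    victims = vertex_order_attack(adj, budget)
    survivors = remove_nodes(adj, victims)
    replenish(survivors, len(victims), rng)
    return adapt(survivors)


rng = random.Random(0)
network = {i: {(i - 1) % 12, (i + 1) % 12} for i in range(12)}  # a 12-node ring
for round_number in range(3):
    network = play_round(network, budget=2, rng=rng)
    print(round_number, len(network), max(len(n) for n in network.values()))
```

Passing a different `adapt` function is where a ring, clique, or delegation defense would plug in.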

An attack strategy is more efficient, for a given defense strategy, if an attacker using 
it requires a smaller budget to disrupt the network. Similarly, a defense strategy is more 
efficient if, for a given attack strategy, it compels the attacker to expend a higher budget 
to achieve network disruption. 

We started off by considering the static attacker of Albert et al., whereby high vertex 
order nodes are removed, and a defense strategy of either random replenishment, 
forming rings, or forming cliques. In the ring strategy, defenders replace high-order 
nodes with rings — as in P2P systems such as Chord. In the clique strategy, high-order 
nodes are replaced with cliques — clusters of nodes all connected to each other. 
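
One way to picture the ring and clique defenses is as an operation that splits a hub into several nodes. The sketch below is an assumption about how such a replacement might be implemented, not a description of the actual simulator: the hub’s former neighbors are shared out round-robin among the new members, which are wired to one another either as a clique or as a ring. The function name and the tuple node labels are hypothetical.

```python
def replace_hub(adj, hub, size, wiring="clique"):
    """Replace `hub` in place by `size` new nodes wired as a clique or a ring,
    sharing the hub's former links among them. Returns the new node labels."""
    neighbours = sorted(adj.pop(hub), key=repr)
    for w in neighbours:
        adj[w].discard(hub)
    members = [(hub, k) for k in range(size)]   # labels such as (0, 0), (0, 1), ...
    for k, m in enumerate(members):
        if wiring == "clique":
            adj[m] = {other for other in members if other != m}
        else:  # "ring": link only to the previous and next member
            adj[m] = {members[(k - 1) % size], members[(k + 1) % size]} - {m}
    for idx, w in enumerate(neighbours):
        m = members[idx % size]
        adj[m].add(w)
        adj[w].add(m)
    return members


# A hub with eight leaves, replaced by four nodes.
star = {0: set(range(1, 9)), **{i: {0} for i in range(1, 9)}}
members = replace_hub(star, 0, 4, wiring="clique")
print(max(len(nbrs) for nbrs in star.values()))
```

Either wiring lowers the maximum vertex order; the two differ in how many internal links must be cut before the replacement group itself falls apart.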

The results of the initial three simulations are given in Figure 25.1. 

Random replenishment (line with circles) in Figure 25.1 provides a calibration 
baseline. As seen above, it is ineffective: within three rounds the size of the largest 
connected component has fallen by a half, from 400 nodes to well under 200. The line 
with crosses shows that rings give only a surprisingly short-term defense benefit. They 
postpone network collapse from about two rounds to about a dozen rounds. Thereafter, 
the network is almost completely disconnected. 

Cliques (indicated by the caret symbol), on the other hand, work well. A few 
vertices are disconnected at each attack round, but the network itself remains robustly 
connected. This may provide some insight into why, although rings have seemed 
attractive to theoreticians, those real revolutionary movements that have left some 
trace in the history books have used a cell structure instead. 

Figure 25.1. Vertex order decapitation attack in rings, cliques, and with no adaptation. 

Figure 25.2. Rings and cliques defense under vertex order and centrality attacks. 

We then proceeded through several rounds of attack evolution. As cliques are a 
good defense against the simple vertex-order attack, we looked for a good way to 
attack cliques. The best performer we found is an attack based on centrality. We used 
Brandes’ algorithm to select the highest-centrality nodes for destruction at each round. 
As before, our calibration baseline is random replenishment. 
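
For readers unfamiliar with it, Brandes’ algorithm computes betweenness centrality by running one breadth-first search per source node and then accumulating path dependencies in reverse order. The sketch below is a compact version for unweighted, undirected graphs; the `centrality_attack` wrapper and the example graph are our own illustration.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.
    Returns unnormalized betweenness (each unordered pair counted once)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                          # nodes in order of discovery
        preds = {v: [] for v in adj}        # predecessors on shortest paths from s
        sigma = dict.fromkeys(adj, 0)       # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):           # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: c / 2.0 for v, c in score.items()}


def centrality_attack(adj, budget):
    """Attacker strategy: choose the `budget` nodes of highest betweenness."""
    scores = betweenness(adj)
    return sorted(adj, key=lambda v: scores[v], reverse=True)[:budget]


# Two triangles joined through node 6, which has low degree but high centrality.
bridge = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 6},
          3: {4, 5, 6}, 4: {3, 5}, 5: {3, 4}, 6: {2, 3}}
print(centrality_attack(bridge, 1))
```

In the example a vertex-order attack would pick node 2 or node 3 (degree 3), whereas the centrality attack picks the bridging node 6, which lies on every path between the two triangles.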

Figure 25.2 shows that the same holds for rings (the squares and crosses): the 
network collapses completely after about a dozen rounds. Centrality attacks are more 
effective against cliques; they significantly reduce the size of the largest connected 
component. 

Then, knowing that centrality attacks are powerful, we tried a number of other 
possible defenses. The most promising at present appears to be a compound defense 
based on cliques and delegation. 

The idea behind delegation is simple. A node that is becoming too well-connected 
selects one of its neighbors as a “deputy” and transfers some of its links to it. This 
reflects normal human behavior even in peacetime: busy leaders pass new recruits on 
to colleagues. In wartime, and with an enemy that might resort to vertex-order attacks, 
the incentive to delegate is even greater. Thus a terrorist leader who gets an offer from 
a wealthy businessman to finance an attack might simply introduce him to a young 
militant who wants to carry one out. The leader need now maintain communications 
with at most one of the two. 
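
Delegation can be expressed as a purely local rewiring rule. The sketch below is one possible reading of the idea, with details that are our assumptions: a node whose degree exceeds a threshold picks its least-connected neighbor as deputy and hands over its excess links, so that the affected neighbors now reach it through the deputy.

```python
def delegate(adj, node, max_degree):
    """If `node` has more than `max_degree` links, move the excess to a deputy.
    Returns the deputy, or None if no delegation was needed."""
    if len(adj[node]) <= max_degree:
        return None
    # Assumed rule: the deputy is the neighbor with the fewest links.
    deputy = min(adj[node], key=lambda v: (len(adj[v]), repr(v)))
    movable = sorted(
        (v for v in adj[node] if v != deputy and v not in adj[deputy]),
        key=repr,
    )
    excess = len(adj[node]) - max_degree
    for v in movable[:excess]:
        adj[node].discard(v)
        adj[v].discard(node)
        adj[deputy].add(v)
        adj[v].add(deputy)
    return deputy


hub_graph = {0: {1, 2, 3, 4, 5, 6}, **{i: {0} for i in range(1, 7)}}
deputy = delegate(hub_graph, 0, max_degree=3)
print(deputy, sorted(hub_graph[0]), sorted(hub_graph[deputy]))
```

The rule uses only the node’s own neighborhood, and it reduces the skew in vertex order that a vertex-order attack exploits.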

The delegation defense on its own, however, is rather like the ring defense. Network 
fragmentation is postponed (about 14 rounds with the parameters used here) though 
not ultimately averted. However, when we form a network and run the delegation 
strategy for some rounds before attacks start, then run a clique defense as well from 
the initiation of hostilities, this compound strategy works rather better than ordinary 
cliques. Figure 25.3 shows the simulation results. 

Figure 25.3. Component size: clique, immunization by delegation, and combined clique and 
delegation defenses against centrality attack. 

Delegation results in shorter path lengths under attack: it postpones and slows down 
the growth of path length that otherwise results from hub elimination. As a result, 
equilibrium is achieved later, and with a larger minimum connected component. 

Finally, we note that clique formation and delegation do not make the attacks in 
the earlier rounds of attack evolution any easier. Specifically, the effectiveness of a 
vertex-order attack depends on the skewness of the distribution of vertex order. Both 
delegation and clique formation lead to lesser skewness, and this is partly why they 
are an effective defense against a vertex-order attack in the first place. Hence these 
defensive manoeuvres will not make the earlier attacks any more effective than in the 
case where no defense actions are taken. 


25.6 Conclusion 


Information security has seen a number of interesting applications of game theory over 
the last 5 years. These have largely taken place in the context of a research program 
on the economics of security, which has built many cross-disciplinary links and has 
produced many useful (and indeed delightful) insights from unexpected places. 

We have discussed how many information security failures are caused by incentive 
failures, where the people who guard a system are not the people who suffer when it 
fails; and how externalities make many security problems somewhat reminiscent of 
environmental pollution. Some aspects of information security are public goods, like 
clean air and water. Externalities also play a key role in determining which security 
products succeed in the market, and which fail. 
Games with incomplete information also play an important role: where either 
information or action is hidden, things can go wrong in interesting ways. Markets 
and auctions can sometimes be used as information-processing mechanisms to tackle 
the resulting problems; we discussed software dependability and the problems of 
cyber-insurance. 

Finally we looked at effects on distributed system architectures. The designers of 
early peer-to-peer systems adopted a flat architecture, which promoted free-riding and 
made attacks easy; later, more successful, systems used a discretionary architecture that 
mitigated these problems. We now know how to analyze cooperation in heterogeneous 
distributed systems, and the tools have wider implications for understanding human 
societies. 

The second aspect of architecture is topology. Albert, Jeong, and Barabási showed 
that scale-free networks are more robust than random networks against random failure, 
but more vulnerable to targeted attack; by extending their analysis from the static to the 
dynamic case, we have shown why revolutionaries organize in cells — and why building 
peer-to-peer systems based on rings was a bad idea. At the conceptual level, we have 
provided a framework for analyzing such problems systematically, and started to build 
a bridge between network analysis and evolutionary game theory. 


25.7 Notes 


Anderson (2001) was the first security researcher to identify the importance of incen- 
tives and economics. In earlier work he described misaligned incentives with respect to 
ATM security (Anderson, 1994) and the Eternity Service, the first peer-to-peer system 
designed to offer censorship resistance (Anderson, 1996). With Danezis, he considered 
the role of economics on censorship resistance (Danezis and Anderson, 2005). 

Varian was the first economist to pay attention to information security. He noted 
that users lacked sufficient incentive to protect themselves from viruses because much 
of the resulting harm was suffered by others (Varian, 2000). He also created a game- 
theoretic model to describe the impact of independent security decisions: whether 
system defense depended on the best effort of the defenders, on their worst effort, or 
on the sum of their efforts (Varian, 2004). 

Kunreuther and Heal extended the result to the case where the security of a group 
rests upon the efforts of interdependent members (Kunreuther and Heal, 2003). Katz 
and Shapiro (1985) famously noted how network externalities affected the adoption 
of technology. Akerlof (1970) won a Nobel prize for his articulation of the effect of 
asymmetric information on markets. 

Schechter (2002) was the first to propose vulnerability markets. Ozment (2004) 
argued that those markets could be better designed as auctions. In joint work, they have 
proposed statistical measures of software security based upon software engineering 
approaches (Ozment and Schechter, 2006a). They have also analyzed the bootstrap- 
ping problems faced by those who would deploy security technologies (Ozment and 
Schechter, 2006b). 

Banking standards for PIN-entry terminals assume a cost-based analysis of vul- 
nerability (PIN management requirements, 2004). Kannan and Telang have analyzed 
the social utility of the organizations currently purchasing software vulnerabilities and 
found it to be less than ideal (Kannan and Telang, 2004). Böhme (2006) has argued 
that software derivatives are a better tool than markets or auctions for the measurement 
of software security. With Kataria, he analyzed how the interdependence of cyber-risks 
could cause insurance market failure (Böhme and Kataria, 2006). Hulisi Ogut, Nirup 
Menon, and Srinivasan Raghunathan showed that the interdependence of cyber-risk 
results in firms underinvesting in both security and insurance (Ogut et al., 2005). 

Crespo and Garcia-Molina (2002) argue for network topologies based on clubs of 
nodes with common interests. Moore (2005) has noted the security import of hidden- 
action attacks. Sparrow (1990) surveyed possible applications of social network theory 
to law enforcement; a more recent survey is by Ballester, Calvó-Armengol, 
and Zenou (2004). For the debate on whether the diversity of modern societies is 
undermining the social solidarity on which welfare systems are based, see Goodhart 
(2004). 

Albert, Jeong, and Barabási (2000) showed that scale-free network topology is 
good for robustness against random failure but bad for security against targeted attack. 
Finally, Nagaraja and Anderson (2006) extended this from the static to the dynamic 
case. 
CHAPTER 26 


Computational Aspects 
of Prediction Markets 


David M. Pennock and Rahul Sami 


Abstract 


Prediction markets (also known as information markets) are markets established to aggregate knowl- 
edge and opinions about the likelihood of future events. This chapter is intended to give an overview 
of the current research on computational aspects of these markets. We begin with a brief survey of 
prediction market research, and then give a more detailed description of models and results in three 
areas: the computational complexity of operating markets for combinatorial events; the design of 
automated market makers; and the analysis of the computational power and speed of a market as an 
aggregation tool. We conclude with a discussion of open problems and directions for future research. 


26.1 Introduction: What Is a Prediction Market? 


Consider the following mechanism design problem called the information aggrega- 
tion problem. Suppose that an individual (“the aggregator”) would like to obtain a 
prediction about an uncertain variable, say the global average temperature in 2020. 
A number of individuals (“the informants”) each hold different and nonindependent 
sets of information bearing on the outcome of the variable. The goal is to design a 
mechanism that extracts the relevant information from the informants, aggregates the 
information appropriately, and provides a collective prediction or forecast. The forecast 
should ideally be equivalent to the omniscient forecast that has direct access to all the 
information available to all informants. 

A prediction market¹ is one mechanism designed to solve the information aggregation 
problem. The aggregator creates a financial security whose payoff is tied to 
the outcome of the variable. For example, he creates a security that pays $x 
if the actual global average temperature in 2020 equals x. The aggregator invites the 
informants to trade the security however they please. For example, global warming 
proponents should be willing to buy the security at or above prices equal to today’s 


1 Prediction markets are also often referred to as information markets, (Arrow-Debreu) securities markets, 
contingent claims, contingent contracts, event markets, event futures, event derivatives, and idea futures. 
global average temperature, and global warming skeptics should be willing to sell at 
those prices.² The aggregator can view the trading price of the security as a collective 
forecast for the expected value of the uncertain variable. In fact, as we shall see in 
Section 26.2.2.3, in some simplified theoretical settings one can prove that the trading price 
converges to a rational expectations equilibrium that mimics the omniscient forecast. 

More importantly, in a broad and diverse number of real-world settings in the lab- 
oratory, in the field, and in practice, prediction markets seem to yield equal or better 
forecasts than other methods of information aggregation. Researchers have proposed 
using prediction markets to help scientists, policymakers, decision makers, the gov- 
ernment, and the military. Several companies — from established brands like Google, 
Microsoft, and Yahoo! to startups like CrowdIQ, InklingMarkets, and NewsFutures — 
are experimenting with prediction market services in the private sector. The growth of 
the field is reflected and fueled by a wave of popular press articles and books on the 
topic, most prominently Surowiecki’s “The Wisdom of Crowds.” 

In this chapter, we focus on algorithmic challenges and constraints associated with 
implementing a prediction market mechanism. We discuss three areas in which com- 
putational constraints are important. 


• Effective prediction markets often need to handle combinations of different events or 
contingent events. However, the number of contingent events grows exponentially in 
the number of base events. In this situation, the basic functions of listing securities and 
clearing markets can become computationally intractable. In Section 26.3, we present 
results on the computational complexity of operating combinatorial markets. 

• To increase trading volume, a prediction market operator often acts as a market maker 
who is always ready to trade. However, to limit the exposure of the market maker, it 
is essential that the market maker adjusts its bid and ask prices after every trade. In 
Section 26.4, we describe two new designs to automate the price updating process in a 
way that limits exposure while encouraging informed traders to trade. 

• When different traders have complementary information about the value of a security, the 
market itself ideally performs a computational function: the final trading price should 
reflect an aggregate of all the traders’ initial information. In Section 26.5, we present a 
simple market model and analyze its computational properties. We derive positive and 
negative results on when the market will converge to the ideal price, as well as bounds 
on a measure of convergence time. 
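
To give a feel for the growth mentioned in the first item, the short sketch below enumerates the joint outcomes of n binary base events and counts the events (subsets of outcomes) that could in principle each back a security. The function names are ours.

```python
from itertools import product


def outcome_states(n):
    """All joint truth assignments to n binary base events: 2**n states."""
    return list(product((False, True), repeat=n))


def count_events(n):
    """Number of distinct events, i.e. subsets of the 2**n states."""
    return 2 ** (2 ** n)


for n in (1, 2, 3, 4, 5):
    print(n, len(outcome_states(n)), count_events(n))
```

Even at n = 5 there are more than four billion distinct events, which is why listing one security per event quickly becomes impractical.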


In Section 26.2, we set up the problem formally and survey the academic literature 
on prediction markets. 


26.2 Background 


26.2.1 Setup and Notation 


In this section we formally pose the aggregation problem that prediction markets are 
designed to address. We begin by introducing a fairly standard model of uncertainty 
and distributed information. 


2 For simplicity, we ignore the time value of money. 
Definition 26.1 Partition model of knowledge: There is a set Ω of possible 
states of the world. At any point of time, the world is in exactly one state ω ∈ Ω, 
but agents do not necessarily know the true state of the world. However, each 
agent i may have partial information about the true state. Agent i’s information 
is represented by a partition π_i of Ω; that is, π_i is a collection {π_i1, π_i2, ..., π_ik} 
of subsets of Ω such that the different subsets are disjoint and the union of all 
subsets is Ω. The semantic interpretation is that i can distinguish two states in 
different subsets π_i1, π_i2 of her partition π_i, but cannot distinguish between two 
states in the same subset of the partition. In particular, agent i knows in which 
subset of her partition the true state of the world lies, but does not know which 
member of that subset is the true state. Given n agents 1, 2, ..., n, their combined 
information π̂ is the coarsest common refinement of the partitions π_1, π_2, ..., π_n. 
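
The coarsest common refinement can be computed directly from its definition: two states share a block exactly when every agent’s partition places them in the same subset. The sketch below is a minimal illustration; the encoding of states as integers whose bits record three base events is an assumption made for the example.

```python
def coarsest_common_refinement(partitions):
    """Group states by the tuple of block indices they occupy in each partition."""
    states = set().union(*partitions[0])
    blocks = {}
    for state in states:
        signature = tuple(
            next(k for k, block in enumerate(p) if state in block)
            for p in partitions
        )
        blocks.setdefault(signature, set()).add(state)
    return sorted((frozenset(b) for b in blocks.values()), key=sorted)


STATES = set(range(8))   # eight states; bit i of a state says whether event i holds


def agent_partition(i):
    """Agent i can tell only whether event i holds."""
    event = {w for w in STATES if (w >> i) & 1}
    return [event, STATES - event]


all_three = coarsest_common_refinement([agent_partition(i) for i in range(3)])
first_two = coarsest_common_refinement([agent_partition(i) for i in range(2)])
print(len(all_three), len(first_two))
```

With all three agents the refinement consists of eight singletons, so their pooled information pins down the state; with only two agents, four blocks of two states each remain.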


The partition model is often augmented with the assumption that there is a common 
prior probability distribution P ∈ Δ(Ω), which captures the probability that all agents 
assign to different states before receiving any information. Once agents obtain their 
partial information, their posterior beliefs follow by conditioning on their information, 
that is, by restricting the prior to the subset of their partition in which the true state 
lies. 

A forecast is an estimate of the expected value of some function f(ω), where f 
is a commonly known (deterministic or stochastic) function of the state of the world. 
A special type of function f : Ω → {0, 1} called an event equals one for a particular 
subset of Ω and zero everywhere else. A joint forecast is a joint probability distribution 
over the values of a number of functions f_1(ω), f_2(ω), .... 


Figure 26.1. Partition model of knowledge. In this example, the set Ω of states of the world 
contains eight mutually exclusive and exhaustive states: ω_1, ω_2, ..., ω_8. Subsets of states like 
X_1, X_2, and X_3 are called events. Suppose that agent i can distinguish between states in X_i and 
states not in X_i, but cannot further distinguish among states. For example, agent 1’s partition π_1 
is {{ω_1, ω_2, ω_3, ω_4}, {ω_5, ω_6, ω_7, ω_8}}. In this simple example, the coarsest common refinement 
of the three agents’ partitions is π̂ = {{ω_1}, {ω_2}, ..., {ω_8}}, meaning that the agents’ combined information is 
always sufficient to precisely identify the true state. Often, we may consider the events X_i as 
the most basic elements of the model, with the ω being the implied product space of these 
base event outcomes. For example, ω_4 in the figure is explicitly indexed as ω_{X_1, ¬X_2, X_3}: the future 
state where X_1 is true, X_2 is false, and X_3 is true. 
On its own, an agent’s best forecast uses its posterior distribution over Ω, but ignores 
information that might be obtained via interaction with other agents. The omniscient 
forecast uses the posterior distribution conditioned on all information available to all 
agents, or P restricted to the subset of π̂ in which the true state lies. 
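
The difference between an individual forecast and the omniscient forecast can be made concrete with a toy computation. The sketch below assumes a uniform common prior over eight states built from three binary events and an event f that equals one when at least two of the base events hold; both choices, and all names, are ours.

```python
from fractions import Fraction

STATES = range(8)   # bit e of a state says whether base event e holds


def holds(event, state):
    return bool((state >> event) & 1)


def majority(state):
    """The event being forecast: at least two of the three base events hold."""
    return 1 if sum(holds(e, state) for e in range(3)) >= 2 else 0


def forecast(f, prior, block):
    """Expected value of f under the prior restricted to `block`."""
    mass = sum(prior[w] for w in block)
    return sum(prior[w] * f(w) for w in block) / mass


def agent_block(agent, true_state):
    """States the agent cannot tell apart from the true one:
    those that agree with it on the agent's own base event."""
    return {w for w in STATES if holds(agent, w) == holds(agent, true_state)}


prior = {w: Fraction(1, 8) for w in STATES}
true_state = 0b101   # base events 0 and 2 hold, base event 1 does not

for agent in range(3):
    print(agent, forecast(majority, prior, agent_block(agent, true_state)))
print("omniscient", forecast(majority, prior, {true_state}))
```

Each agent alone is left with a forecast of 3/4 or 1/4, while pooling all three observations identifies the state and yields a forecast of 1.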

In reality, each agent’s information is private knowledge that is not directly accessible 
to any one entity. Thus information aggregation is a problem of mechanism design (see 
Chapter 9). The goal is to produce a mechanism that incentivizes the agents to reveal 
their information such that, in equilibrium, the mechanism produces a forecast as close 
as possible to the omniscient forecast. 

A prediction market is one type of information aggregation mechanism. The market 
contains financial securities whose payoffs are functions of the state of the world. In 
the simplest case, the market contains a security paying off f(ω) dollars in state ω. 
Thus agents are incented through the prospect of financial gain to reveal information 
bearing on the expected value of f(ω), and the equilibrium price reached by a 
number of interacting agents can be viewed as a collective forecast. As we shall see in 
Section 26.5, even when a single forecast is sought, multiple securities might be required 
to ensure convergence to equilibrium. In Section 26.3 we explore the computationally 
challenging case of setting up a market to yield a joint forecast. 


26.2.2 Survey of the Field 


The field of prediction markets is largely an empirical science, and much of the 
academic literature focuses on laboratory and field experiments testing the accuracy of 
predictions in a variety of settings. However, a prediction market is operationally no 
different than a standard financial market, so a large amount of economic and financial 
theory applies. 


26.2.2.1 What and How: Instruments and Mechanisms 


A prediction market can be designed to elicit a forecast for any type of random variable 
or set of variables. For example, the variable can be binary (“will a Republican win
the next US Presidential election?”), discrete (“who will win the next US Presidential
election? A Democrat, a Republican, or someone else?”), continuous (“what will the 
global average temperature be in 2020?”), or a joint space of any combination of the 
above. 

Beyond “what” is being traded, there are a variety of different mechanisms specify- 
ing “how” the securities are traded, including a call market auction, continuous double 
auction, continuous double auction with market maker, bookmaker, parimutuel market, 
and combinatorial versions of the above, all of which have some empirical record of 
success. 

In a call market auction, all bids are collected over time, then processed together 
in large batches. The clearing price can be the mth lowest price, the m + 1st lowest 
price, or somewhere in between, where m is the number of sellers. A continuous 
double auction is a continuous version of a call market, where as soon as any trade 
is acceptable to any two bidders, the trade is immediately executed, usually at the bid 
price of the least recent bidder. A market maker or bookmaker is a price maker who is 
nearly always willing to accept both buy and sell orders at some stated (but changing)
prices. In a parimutuel market, players compete in a wagering game to earn as large a 
portion as possible of the total amount of money wagered by all players. 
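
The call-market clearing rule can be sketched as follows, under the simplifying assumption (ours, not the text's) that every buyer bids for exactly one unit and every seller offers exactly one unit. The function names are hypothetical.

```python
def clearing_interval(bids, asks):
    """Return the (m-th lowest, (m+1)-st lowest) prices among all submitted prices.

    m is the number of sellers. Assumes unit-quantity orders and at least one
    bid and one ask; any price in the closed interval clears the market.
    """
    m = len(asks)
    prices = sorted(list(bids) + list(asks))
    return prices[m - 1], prices[m]

def trades_at(price, bids, asks):
    """Number of units exchanged if all orders are cleared at a single price."""
    willing_buyers = sum(1 for b in bids if b >= price)
    willing_sellers = sum(1 for a in asks if a <= price)
    return min(willing_buyers, willing_sellers)

# Illustrative batch: three buyers and three sellers.
example_bids = [9, 6, 3]
example_asks = [2, 5, 8]
low, high = clearing_interval(example_bids, example_asks)
```

For this batch the interval is [5, 6]; at any price inside it, two buyers and two sellers are willing to trade.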


26.2.2.2 Examples and Evaluations 


A prediction market cannot surface information that does not exist or is unknown, 
so the accuracy of a prediction market can only be evaluated in comparison to other 
information aggregation or forecasting methods. The central empirical question is 
whether a prediction market aggregates or summarizes information more accurately 
than other methods. 

One of the most cited and most successful prediction markets is the Iowa Electronic
Market (IEM). Since 1988, IEM has been operating real-money prediction markets, 
mostly on the outcomes of political elections. Empirically, on average the market’s 
predictions are more accurate and less volatile than political opinion polls, especially 
in large US elections. The markets react to new information quickly, sometimes within 
minutes, and often before the new information becomes widespread. The markets are 
accurate despite documented evidence that individual traders are often biased and 
irrational and make mistakes. Several IEM publications support a theory that accuracy 
derives not from average traders, but from marginal traders. Marginal traders are more 
active, less biased, more successful, and price makers rather than price takers. As long 
as a few good marginal traders exist, the market as a whole remains accurate despite 
the poor traders. 

Options, futures, and other financial derivatives are contracts whose payoff is a 
function of some underlying uncertain variable. For example, the payoff of a stock 
option with strike price \(k\) is \(\max[0, s - k]\), where \(s\) is the price of the corresponding
stock at some future date. Sports betting markets can also be viewed and analyzed as 
prediction markets. Several empirical studies verify that derivative prices and sports 
betting odds constitute accurate forecasts for their underlying variables. 

Even play-money markets show a surprising ability to aggregate information. Studies 
of market games like the Hollywood Stock Exchange, NewsFutures, and the Foresight 
Exchange report accuracies equal to or better than expert opinions and, remarkably, 
sometimes on par with equivalent real-money prediction markets. 

Experimental economists have tested the aggregation properties of prediction mar- 
kets in laboratory settings. The experimenter sets up the forecasting problem and 
carefully controls the information each participant receives. A number of experimental 
designs reveal when market aggregation seems to work and when it does not. Generally, 
given enough securities and enough practice, traders in the laboratory often converge to 
prices close to the omniscient forecasts. Researchers have devised and tested methods 
for achieving accurate results across as many forecast variables as possible with as few 
participants as possible. 

Economists have also run field tests of markets used to forecast quantities of interest 
to an organization. For example, a market was tested at Hewlett Packard to project 
the company’s sales volume for particular products. Generally, the market predictions 
were superior to the official HP forecasts. Other companies, including Microsoft and 
Google, are now running similar internal prediction markets. 


26.2.2.3 Theoretical Underpinnings


There is a fundamental difference between a market for a financial security and a market 
for a consumer product: the security has no direct consumption value to potential 
buyers. Buyers want to buy the security only because they believe they can later resell 
it or cash it out for a higher price. This simple observation invalidates the classical 
model of demand, in which each trader has a fixed demand curve that describes the 
quantity demanded at each price. The market provides information about other traders’ 
knowledge and beliefs, which may lead a trader to change her beliefs about the future 
value of the security. In this manner, the market prices can lead to changes in the traders’ 
demand curves! This led to the development of a new theory, the theory of rational 
expectations, that seeks to understand this latter kind of market. The cornerstone of this 
theory is a new equilibrium concept, the rational expectations equilibrium. Intuitively, 
a rational expectation equilibrium price is a market-clearing price such that traders will 
not want to change their trades even after observing the price itself. 


Rational expectations. Consider the model of Section 26.2.1: an uncertain world with
possible states \(\Omega\), and \(n\) traders trading in a market for some good. Let \(v_i(q_i, \omega)\)
denote the ultimate value of \(q_i\) units of the good to trader \(i\) in state \(\omega\). The traders are
partially informed: let \(\pi_i\) denote trader \(i\)’s private information, and assume that there
is a common prior distribution \(\mathcal{P}\). Furthermore, we assume that all traders are
risk-neutral Bayesians. To simplify the exposition, we consider the special case in which
\(\hat{\pi} = \Omega\), so the combined information of all agents is sufficient to pinpoint the true
state. The equilibrium price is not a simple number as in the case of the competitive
equilibrium; instead, it is a mapping \(P^* : \Omega \to \mathbb{R}\) that maps a state of the world to a
price.


Definition 26.2 A rational expectations equilibrium is a mapping \(P^* : \Omega \to \mathbb{R}\)
such that in every state \(\omega\), if every trader conditions her demand (or supply) on
her private information \(\pi_i\) as well as the price \(P^*(\omega)\), the market will clear at a
price of exactly \(P^*(\omega)\). In other words, it is a self-fulfilling correspondence from
states to prices.


This definition is subtle, and needs to be reasoned through carefully. Consider an 
arbitrary nonconstant mapping \(P\) from states to prices. Then, by observing the price
\(P(\omega)\), an agent who knew the mapping could immediately rule out some states of the
world: those that would have resulted in a different price. Thus, any mapping \(P\) induces
a partition \(\pi_P\) such that anyone observing \(P(\omega)\) knows \(\pi_P\) in addition to her initial
information. Now, trader \(i\)’s effective demand curve in state \(\omega\) will be given by her
expected value for the item conditioned on both the price and her private information:
\(\hat{v}_i(q_i, \omega) = E[v_i(q_i, \omega) \mid \pi_i(\omega), P(\omega)]\). Given the demand and supply curves for the \(n\)
agents, it is possible to calculate a clearing price \(\hat{P}(\omega)\). The price mapping \(P\) would
be a rational expectations equilibrium iff \(\hat{P}(\omega) = P(\omega)\) for all \(\omega\). In other words, it is
rational for the agents to believe in a price mapping P only if all agents believing in 
that mapping and acting accordingly would lead to the prices predicted by P. 

Researchers have shown the existence of rational expectations equilibria in
economies with asymmetric information under fairly general conditions on the value
functions \(v_i(\cdot, \cdot)\). Furthermore, it has been shown that under generic conditions, these
economies admit “fully-revealing” rational expectations equilibria: price correspondences
\(P^*(\cdot)\) such that \(P^*(\omega_1) \neq P^*(\omega_2)\) whenever \(\omega_1 \neq \omega_2\). In this case, it follows
that the price reveals the combined information of all traders, i.e., \(\pi_{P^*} = \hat{\pi}\), the full-
information partition. This leads to startling, and sometimes counterintuitive, conse- 
quences; we discuss some of these in subsequent sections. We note, however, that the 
rational expectations literature has been criticized because the definition of a rational 
expectations equilibrium says nothing about how traders might learn and agree on the 
equilibrium price mapping P*. In applying this concept, it is important to keep this in 
mind, and take the price formation process into account when possible. 
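
The partition induced by a price mapping, and the fully revealing condition, can be checked mechanically on a finite state space. The sketch below represents a price mapping as a dictionary from state labels to prices; the labels, prices, and function names are hypothetical.

```python
from collections import defaultdict

def induced_partition(price_map):
    """Group states by price: an observer of the price learns only the block."""
    blocks = defaultdict(set)
    for state, price in price_map.items():
        blocks[price].add(state)
    return sorted(blocks.values(), key=min)

def is_fully_revealing(price_map):
    """True when distinct states always map to distinct prices."""
    return len(set(price_map.values())) == len(price_map)

# Two hypothetical mappings over three states.
pooling = {"w1": 0.2, "w2": 0.5, "w3": 0.5}      # w2 and w3 share a price
separating = {"w1": 0.2, "w2": 0.5, "w3": 0.7}   # every state has its own price
```

Under the pooling mapping, a trader who observes the price 0.5 cannot tell the second state from the third; under the separating mapping, the price alone identifies the state.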


Efficient market hypothesis and no-trade theorems. The existence of fully reveal- 
ing equilibria has led researchers to propose the “efficient market hypothesis.” The 
strong form of this hypothesis states that a security’s market price fully reflects all 
the information relevant to its value. The efficient market hypothesis, with its roots 
in rational expectations theory, provides a theoretical foundation for why prediction 
markets are likely to be effective: In a situation in which many traders have a small 
amount of private information about an event, it states that the prediction market price 
will reflect the combined information of all traders. 

One of the most counterintuitive results of rational expectations theory is the exis- 
tence of no-trade theorems. The key observation is that, in a fully revealing rational 
expectations equilibrium, the price information captures every agent’s private infor- 
mation. Thus, in a fully revealing equilibrium, all agents are conditioning their beliefs 
on identical information, and hence have identical posterior beliefs. It follows that all 
agents assign the same expected value to the security, and hence, there will not be any 
trade in equilibrium. This reasoning can be extended to show that no two rational agents 
will want to trade with each other even if they are not initially in equilibrium, because 
the mere willingness of the other party to trade at a given price reveals information that 
leads to an equilibrium. Several variants of this result, under different conditions, have 
been shown. 

Thus, we seem to have a paradoxical situation in which the final price reflects all 
the traders’ information, but the traders would never want to trade so there is no way 
for their information to get into the prices! However, the no-trade results are very
sensitive to the precise conditions specified (risk-neutrality and common knowledge
that all traders are completely rational Bayesians), and even tiny perturbations of
these conditions invalidate them. In practice, there are several reasons that can lead an 
informed trader to expect a profit from trade, such as the existence of irrational traders, 
traders who are trading to hedge risks, traders who trade for liquidity reasons, or a 
market maker who is subsidizing the market. 


26.3 Combinatorial Prediction Markets 


Up to this point, we have concentrated on the economic, strategic, and statistical 
properties of prediction markets. We now turn our attention to the computational 
problems that arise in the study of prediction markets. In this section, we consider
combinatorial markets. These are markets in which the state space is the product space 
of a number of base events. Here, we consider state spaces generated by Boolean 
events: propositions such as, “the price of gasoline is greater than $3” that may be 
either true or false in the future world. Suppose that there is some finite set \(\mathcal{E}\) of base
events, and furthermore, suppose that these events are linearly independent in the sense
that the value (true or false) of any event cannot be determined with certainty even
if the value of all other events is known. Then, the state space \(\Omega\) is of size \(2^{|\mathcal{E}|}\), with
each state corresponding to a particular assignment of values to the individual events.
We use the symbols \(X_1, X_2, X_3, \ldots\) to denote the individual Boolean events in \(\mathcal{E}\).

Let \(S_\omega\) be a security that pays $1 if the eventual state is \(\omega\), and pays $0 otherwise.
Classic results on market equilibrium show that a market can be guaranteed to be
efficient if it is possible for a trader to express her desire for any such \(S_\omega\). This does
not necessarily mean that the securities \(S_\omega\) have to be directly traded in the market, as
long as the market has a set of securities such that a trader could construct a portfolio
with payoff similar to any \(S_\omega\) she desires. Such a market is called a complete market.
Unfortunately, any complete market must have at least \(2^{|\mathcal{E}|}\) securities; if the number of
base events is large, even listing all the securities may be impossible!

However, this does not mean that it is impossible to achieve efficient hedging or 
information aggregation in practice. There may be many fewer than \(2^{|\mathcal{E}|}\) combinations
of events that traders actually care about, or have specific information about. This
raises the following questions: (1) Is there a “natural” representation such that realistic
events, securities, and buy/sell orders can be represented succinctly? (2) Given orders
in this representation, is it possible to identify and execute possible trades? 

The underlying structure of the state space can be exploited through the use of 
prediction markets with expressive bidding languages. We distinguish between two 
forms of expressivity: combined orders and compound orders. 

A combined order allows the trader to specify a collection of securities he or she 
would like to trade together as a bundle, with limit prices specified for each component 
security. If the trader cannot obtain all of the securities at prices equal to or better than 
the specified limits, then the trader prefers not to receive any of the securities. This form 
of expressivity reduces so-called execution risk, where during the course of carrying out 
a planned series of transactions, the prices of some securities change, thereby reducing 
or reversing the utility of the earlier trades. If there are \(|\mathcal{E}|\) Boolean event securities,
then traders can place a combined order for any of the \(2^{|\mathcal{E}|}\) possible bundles (subsets)
of the securities. When combined orders are allowed, the auctioneer problem is essen- 
tially the same as in the combinatorial auction scenario (see Chapter 11). One distinction 
is that, while bids in combinatorial auctions are generally considered indivisible, bids 
in a securities market often can be considered divisible, thus simplifying the matching 
problem. The auctioneer problem of matching combined orders in a securities market 
is also called combined value trading. 

A compound order allows the trader to speculate on any compound Boolean expression
involving a set \(\mathcal{E}\) of base events. If there are \(|\mathcal{E}|\) base events, then there are \(2^{|\mathcal{E}|}\)
possible combinations of outcomes of those events, and there are \(2^{2^{|\mathcal{E}|}}\) distinct subsets
of those combinations expressible using Boolean formulas. For the remainder of this
section, we will focus on compound orders, a strict superset of combined orders. 


26.3.1 Compound Prediction Markets


We now describe a concrete representation for compound order securities. The securities
are based on Boolean formulas over the set of propositions \(\mathcal{E}\); given a formula
\(\phi\), we have a security that pays $1 iff \(\phi\) is true in the eventual state. More generally,
we allow conditional securities \(S_{\phi|\psi}\) based on two formulas \(\phi\), \(\psi\); this is interpreted
as “Make a payoff according to \(\phi\), conditional on \(\psi\) being true.” In other words, the
owner of security \(S_{\phi|\psi}\) is paid $1 if both \(\phi\) and \(\psi\) are true, paid $0 if \(\psi\) is true but \(\phi\) is
false, and the security is cancelled (and any money the owner paid for it is refunded)
iff \(\psi\) is false.


26.3.1.1 Orders 


Agents place orders, denoted \(o\), of the form “\(q\) units of \(S_{\phi|\psi}\) at price \(p\) per unit,”
where \(q > 0\) implies a buy order and \(q < 0\) implies a sell order. We assume that agents
submitting buy (sell) orders will accept any price \(p^* \le p\) (\(p^* \ge p\)). We distinguish
between divisible and indivisible orders. Agents submitting divisible orders will accept
quantity \(\alpha q\) for any \(0 < \alpha \le 1\). Agents submitting indivisible orders will accept only
exactly \(q\) units, or none.

Every order \(o\) can be translated into a payoff vector \(\Upsilon\) across all states \(\omega \in \Omega\).
The payoff \(\Upsilon^{(\omega)}\) in state \(\omega\) is \(q \cdot 1_{\omega \in \psi}(1_{\omega \in \phi} - p)\), where \(1_{\omega \in E}\) is the indicator function
equaling 1 iff \(\omega \in E\) and zero otherwise. Let the set of all orders be \(O = \{o_i\}\) and the
set of corresponding payoff vectors be \(P = \{\Upsilon_i\}\).
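
The translation from an order to its payoff vector can be sketched directly from the formula above. In this illustration a state is a tuple of truth values for the base events, and the formulas are passed as Python predicates on a state; that representation and the function names are assumptions made here, not part of the text.

```python
from itertools import product

def states(num_events):
    """All truth assignments to the base events, as tuples of booleans."""
    return list(product([True, False], repeat=num_events))

def payoff_vector(q, phi, psi, p, num_events):
    """Payoff of an order for q units of S_{phi|psi} at price p, one entry per state.

    phi and psi are predicates on a state; each entry is q * 1[psi] * (1[phi] - p),
    so the order pays nothing (it is cancelled) wherever psi is false.
    """
    return [q * int(psi(w)) * (int(phi(w)) - p) for w in states(num_events)]

# Buy one unit of the unconditional security on "X1 and X2" at price 0.4.
buy_both = payoff_vector(1, lambda w: w[0] and w[1], lambda w: True, 0.4, 2)

# Buy two units of "X1 conditional on X2" at price 0.5.
buy_conditional = payoff_vector(2, lambda w: w[0], lambda w: w[1], 0.5, 2)
```

With two base events the states are ordered (true, true), (true, false), (false, true), (false, false), so the conditional order has a zero entry in the second and fourth positions, where the condition fails.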


26.3.1.2 The Matching Problem 


The auctioneer’s task, called the matching problem, is to determine which orders to 
accept among all orders \(o \in O\). Let \(\alpha_i\) be the fraction of order \(o_i\) accepted by the
auctioneer (in the indivisible case, \(\alpha_i\) must be either 0 or 1; in the divisible case, \(\alpha_i\) can
range from 0 to 1). If \(\alpha_i = 0\), then order \(o_i\) is considered rejected and no transactions
take place concerning this order. For accepted orders (\(\alpha_i > 0\)), the auctioneer receives
the money lost by bidders and pays out the money won by bidders, so the auctioneer’s
payoff vector (or surplus vector) is 


\[\Upsilon_{\text{auc}} = \sum_{\Upsilon_i \in P} -\alpha_i \Upsilon_i.\]


Assume that the auctioneer wants to choose a set of orders so that it is guaranteed 
not to lose any money in any future state, but that the auctioneer does not necessarily 
insist on obtaining a positive benefit from the transaction (i.e., the auctioneer is content 
to break even). 


Definition 26.3 (Indivisible matching problem) Given a set of orders O, 
does there exist \(\alpha_i \in \{0, 1\}\) with at least one \(\alpha_i = 1\) such that \(\forall \omega,\ \Upsilon_{\text{auc}}^{(\omega)} \ge 0\)? In
other words, does there exist a nonempty subset of orders that the auctioneer can 
accept without risk? 


Example 26.4 (Indivisible order matching) Suppose \(|\mathcal{E}| = 2\). Consider an
order to buy one unit of \(S_{X_1 X_2}\) at price 0.4 and an order to sell one unit of \(S_{X_1}\) at
price 0.3. The corresponding payoff vectors are


\[\Upsilon_1 = \langle \Upsilon_1^{(X_1 X_2)}, \Upsilon_1^{(X_1 \bar{X}_2)}, \Upsilon_1^{(\bar{X}_1 X_2)}, \Upsilon_1^{(\bar{X}_1 \bar{X}_2)} \rangle = \langle 0.6, -0.4, -0.4, -0.4 \rangle\]
\[\Upsilon_2 = \langle -0.7, -0.7, 0.3, 0.3 \rangle\]


The auctioneer’s payoff vector (the negative of the component-wise sum of the
above two vectors) is


\[\Upsilon_{\text{auc}} = -\Upsilon_1 - \Upsilon_2 = \langle 0.1, 1.1, 0.1, 0.1 \rangle.\]


Since all components are nonnegative, the two orders match. The auctioneer can
process both orders, leaving a surplus of $0.1 in cash and one unit of \(S_{X_1 \bar{X}_2}\) in
securities.
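
The indivisible matching problem can be illustrated with an exhaustive search over subsets of orders. This is only a brute-force sketch with hypothetical function names, exponential in the number of orders, and not an algorithm proposed in the text. It is run on the two payoff vectors of the example, with states ordered as (X1 and X2, X1 and not X2, not X1 and X2, neither).

```python
from itertools import product

def auctioneer_payoff(payoff_vectors, accept):
    """Negative sum of the accepted orders' payoff vectors, state by state."""
    num_states = len(payoff_vectors[0])
    return [-sum(a * y[s] for a, y in zip(accept, payoff_vectors))
            for s in range(num_states)]

def indivisible_match(payoff_vectors, tol=1e-9):
    """Return a 0/1 acceptance tuple under which the auctioneer never loses, or None.

    Tries every nonempty subset of orders; tol absorbs floating-point rounding.
    """
    for accept in product([0, 1], repeat=len(payoff_vectors)):
        surplus = auctioneer_payoff(payoff_vectors, accept)
        if any(accept) and all(v >= -tol for v in surplus):
            return accept
    return None

# Payoff vectors of the two orders in the example.
y1 = [0.6, -0.4, -0.4, -0.4]   # buy one unit on "X1 and X2" at 0.4
y2 = [-0.7, -0.7, 0.3, 0.3]    # sell one unit on "X1" at 0.3
match = indivisible_match([y1, y2])
```

Neither order can be accepted alone without risk; only accepting both leaves the auctioneer with a nonnegative payoff in every state.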


Now consider the divisible case, where orders can be partially filled.


Definition 26.5 (Divisible matching problem) Given a set of orders O, does 
there exist \(\alpha_i \in [0, 1]\) with at least one \(\alpha_i > 0\) such that \(\forall \omega,\ \Upsilon_{\text{auc}}^{(\omega)} \ge 0\)?


The matching problems defined above are decision problems: the task is only to show 
the existence or nonexistence of a match. We could additionally seek to maximize some 
objective function — like trading volume or auctioneer expected profit — to choose the 
best among all possible matches. Here, we restrict our attention to the decision problem 
formulations. 


26.3.1.3 The Computational Complexity of Matching 


In this section we examine the computational complexity of the auctioneer’s matching 
problem. Here n is the size of the problem’s input, including descriptions of all the 
buy and sell orders. We also assume that n bounds the number of base securities. We 
consider four cases based on two parameters: 


(i) Whether to allow divisible or indivisible orders. 

(ii) The number of securities. We consider two possibilities: (a) \(O(\log n)\) base securities
yielding a polynomial number of states, or (b) \(\Theta(n)\) base securities yielding an
exponential number of states.


Theorem 26.6 The matching problem for divisible orders is 
(i) computable in polynomial-time for O(log n) base securities. 


(ii) co-NP-complete for unlimited securities. 


PROOF Small number of securities with divisible orders. We can build a 
linear program based on Definition 26.5. We have variables \(\alpha_i\). For each \(i\), we
have \(0 \le \alpha_i \le 1\), and for each state \(\omega\) in \(\Omega\) we have the constraint


\[\text{Payment}(\omega) = \sum_i -\alpha_i \Upsilon_i^{(\omega)} \ge 0.\]


Given these constraints, we maximize \(\sum_i \alpha_i\). A set of orders has a matching
exactly when \(\max \sum_i \alpha_i > 0\). With \(O(\log n)\) base securities, we can solve this
linear program in polynomial time. Note, however, that this approach may not
find matchings that have precisely zero surplus.
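
As a worked instance, take the two orders of Example 26.4 with payoff vectors \(\langle 0.6, -0.4, -0.4, -0.4 \rangle\) and \(\langle -0.7, -0.7, 0.3, 0.3 \rangle\) and treat them as divisible. Writing out one constraint per state gives the following linear program; the state labels on the right are our annotation.

```latex
\begin{align*}
\text{maximize}\quad & \alpha_1 + \alpha_2 \\
\text{subject to}\quad
 & -0.6\,\alpha_1 + 0.7\,\alpha_2 \ge 0 && (X_1 X_2) \\
 & \phantom{-}0.4\,\alpha_1 + 0.7\,\alpha_2 \ge 0 && (X_1 \bar{X}_2) \\
 & \phantom{-}0.4\,\alpha_1 - 0.3\,\alpha_2 \ge 0 && (\bar{X}_1 X_2) \\
 & \phantom{-}0.4\,\alpha_1 - 0.3\,\alpha_2 \ge 0 && (\bar{X}_1 \bar{X}_2) \\
 & 0 \le \alpha_1, \alpha_2 \le 1 .
\end{align*}
```

The point \(\alpha_1 = \alpha_2 = 1\) satisfies every constraint, with left-hand sides 0.1, 1.1, 0.1, and 0.1, so the maximum is 2 and a matching exists.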

Large number of securities with divisible orders. With unlimited base securi- 
ties, the linear program given in Section 26.3.1.3 has an exponential number of 
constraint equations. Each constraint is short to describe and easily computable 
given \(\omega\). Let \(m \le n\) be the total number of buy and sell orders. By the theory of
linear programming, an upper bound on the objective function can be forced by a
collection of \(m + 1\) constraints. So if no matching exists there must exist \(m + 1\)
constraints that force all the \(\alpha_i\) to zero. In nondeterministic polynomial-time we
can guess these constraints and solve the reduced linear program. This shows that 
matching is in co-NP. 

To show co-NP-completeness, we reduce the NP-complete problem of Boolean 
formula satisfiability to the nonexistence of a matching. Fix a formula \(\phi\). Let the
base securities be the variables of \(\phi\) and consider the single security \(S_\phi\) with a buy
order of 0.5. If the formula \(\phi\) is satisfiable, then there is some state with payoff
0.5 (auctioneer payoff \(-0.5\)) and no fractional unit of security \(S_\phi\) is a matching.
If the formula \(\phi\) is not satisfiable then every state has an auctioneer’s payoff of
0.5 and a single unit of \(S_\phi\) is a matching.
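
The reduction can be checked by brute force on small formulas. The sketch below enumerates every state, which takes exponential time and is meant only to illustrate the correspondence between unsatisfiability and matching; the function name and the example formulas are hypothetical.

```python
from itertools import product

def order_matches(formula, num_vars, price=0.5):
    """Check whether the single buy order on S_phi at the given price can be matched.

    Per unit accepted, the auctioneer earns price - 1 in states where the formula
    holds and price elsewhere, so a risk-free acceptance exists exactly when no
    state satisfies the formula.
    """
    auctioneer = [price - int(bool(formula(*w)))
                  for w in product([False, True], repeat=num_vars)]
    return all(v >= 0 for v in auctioneer)

def contradiction(x):
    """Unsatisfiable: x and not x."""
    return x and not x

def satisfiable(x, y):
    """Satisfied by x = False, y = True."""
    return (x or y) and not x
```

Here `order_matches(contradiction, 1)` holds because the buyer can never win, while `order_matches(satisfiable, 2)` fails because the auctioneer would lose 0.5 in the satisfying state.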


For indivisible orders, the matching problem turns out to be even harder to solve. 
We state the following result; because of space restrictions, we do not reproduce the 
proof here. 


Theorem 26.7 The matching problem for indivisible orders is 
(i) NP-complete for O(log n) base securities. 


(ii) \(\Sigma_2^p\)-complete for unlimited securities.


26.3.2 Compact Prediction Markets 


Compound orders are very general: traders can submit orders for any Boolean expres- 
sion of base events. Computational limits aside, a market system supporting compound 
orders effectively implements a complete securities market, as defined above, mean- 
ing that all possible mutually agreeable transactions can proceed, supporting a Pareto 
optimal and economically efficient allocation of securities. 

Of course, computational limits are a real practical barrier; matching compound 
orders can easily become intractable. By limiting the full expressivity of compound 
orders, computational complexity can be reduced. 

One natural restriction takes advantage of any (conditional) independence relation- 
ships among base events. Suppose that the statistical dependency structure of the base 
events is encoded as a Bayesian network. That is, the joint probability distribution over
the base events can be factored as follows: 


\[\Pr(X_1 X_2 \cdots X_{|\mathcal{E}|}) = \prod_{k=1}^{|\mathcal{E}|} \Pr(X_k \mid \mathrm{pa}(X_k)),\]


where \(\mathrm{pa}(X_k)\) is a set of base events with index less than \(k\) called \(X_k\)’s parents. The
factorization can be depicted as a directed acyclic graph with nodes representing base
events and edges from each event in \(\mathrm{pa}(X_k)\) to \(X_k\), representing direct conditional
dependencies. 

Now restrict trading to conditional securities of the form \(S_{X_k|\mathrm{pa}(X_k)}\), one for each
conditional probability \(\Pr(X_k \mid \mathrm{pa}(X_k))\) in the Bayesian network. Each event \(X_k\) with
\(|\mathrm{pa}(X_k)|\) parents corresponds to \(2^{|\mathrm{pa}(X_k)|}\) securities, one for each possible combination
of outcomes of events in \(\mathrm{pa}(X_k)\). A securities market structured in this way contains
\(O(|\mathcal{E}| \cdot 2^{\max_k |\mathrm{pa}(X_k)|})\) securities, which can be considerably fewer than the \(2^{|\mathcal{E}|}\) securities
required for a complete market, if \(\max_k |\mathrm{pa}(X_k)| \ll |\mathcal{E}|\). Call such a market a
BN-structured market.
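
The saving can be seen by counting securities for a given network. The sketch below represents a Bayesian network as a dictionary from each event to its list of parents; the chain-shaped example and the function names are hypothetical.

```python
def bn_market_size(parents):
    """Count conditional securities: one per event per joint outcome of its parents."""
    return sum(2 ** len(pa) for pa in parents.values())

def complete_market_size(parents):
    """Number of states, the figure the text gives for a complete market."""
    return 2 ** len(parents)

# Hypothetical chain X1 -> X2 -> ... -> X10: each event has at most one parent.
chain = {f"X{k}": ([] if k == 1 else [f"X{k - 1}"]) for k in range(1, 11)}
```

For this ten-event chain the BN-structured market needs 19 securities (one for the root plus two for each of the nine remaining events), against 1024 for a complete market.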

Although the need for \(2^{|\mathcal{E}|}\) securities cannot be relaxed if one wants to guarantee
completeness in all circumstances, there are some restrictive conditions under which 
a smaller BN-structured securities market may be operationally complete, meaning 
that its equilibrium is Pareto optimal with respect to the traders involved. In particular, 
if all traders’ risk-neutral independencies agree with the independencies encoded in 
the market structure, then the market is operationally complete. For collections of 
agents all with constant absolute risk aversion (negative exponential utility for money), 
agreement on Markov independencies is sufficient for operational completeness. 


26.4 Automated Market Makers 


The standard way to organize a market is as a continuous double auction, in which 
traders arrive asynchronously and place their orders, and a trade takes place if a buyer 
quotes a higher price than a seller who is present at the same time. In a prediction 
market organized in this way, a speculator with private information about the security 
would have to submit her order and wait for another trader to place a matching order. 

There are two problems with this scenario. First, the informed trader may not 
be willing to wait indefinitely for a partner to trade with. If there are few potential 
traders, they may never even enter the market because they do not expect to find a 
trading partner. This is the thin market problem: a “chicken and egg” scenario where 
few traders care to participate because other traders are scarce, leading to a potential 
breakdown of the market. The thin market problem can be especially severe in a 
combinatorial market because each trader’s attention is divided among an exponential 
number of choices, making the likelihood of a match between traders seem very remote. 
Second, an informed trader may not want to reveal her willingness to trade (at a given 
price), because this may tip off other traders, and may prevent her from making a 
profit. This effect is related to the no-trade theorems discussed in Section 26.2.2.3, 
and arises because traders are essentially playing a zero-sum game with each other. 
Both problems can reduce the incentives for traders to participate, thus reducing the
informativeness of prices. 

An alternative to using a double auction mechanism is for the market to include a 
market maker. A market maker is an agent who is always ready to trade. Typically, a 
market maker posts bid and ask prices (which may be identical); then a seller who is 
willing to sell at the bid price (or a buyer who is willing to pay the ask price) can trade 
with the market maker. The market maker may later resell the securities it bought to a 
buyer. In this way, the market maker can effectively engineer a trade between a buyer 
and a seller who arrive at different times and do not wait. 

Of course, one side effect of having a market maker is that the market operator could 
potentially make a loss. This is not necessarily a negative property; in essence, it is a 
way of injecting subsidies into the market. The no-trade theorems no longer apply to 
a market with subsidies, so informed speculators can rationally expect to profit from 
their trade. However, it is important that the loss be predictable or bounded. To achieve 
this, the bid and ask prices must be adjusted in a systematic way after every trade; the 
new prices are computed by an automated market maker. 

An ideal automated market maker should satisfy three properties: (1) it should run 
a predictable or bounded loss; (2) informed traders should have an incentive to trade 
whenever their information would change the price; and (3) after any trade, computing 
the new prices should be a tractable problem. In this section, we describe two new 
microstructures for prediction markets that effectively function as automated market 
makers, and appear to have all these properties. 


26.4.1 Market Scoring Rules 


Hanson shows how any proper scoring rule, or payment scheme designed to elicit 
truthful reporting of probabilities, can be converted into an automated market maker. 
The market maker can be thought of as a sequential shared version of the scoring 
rule, as we describe later. First, we describe the market maker algorithm in a more
conventional light. 

Suppose that the market contains \(|\Omega|\) mutually exclusive and exhaustive securities.
Let \(q_j\) be the total quantity of security \(j\) held by all traders combined, and let \(q\) be the
vector of all quantities held. The market maker utilizes a cost function \(C(q)\) that records
the total amount of money traders have spent as a function of the total number of shares
held of each security. A trader who wants to purchase \(\delta\) shares of security \(j\) must pay
\(C(q_1, \ldots, q_j + \delta, \ldots, q_{|\Omega|}) - C(q)\) dollars. More generally, a trader who wants to buy
or sell any bundle of securities (i.e., any combined order or compound order, as defined
in Section 26.3) such that the total number of outstanding shares changes from \(q_{\text{old}}\)
to \(q_{\text{new}}\) must pay \(C(q_{\text{new}}) - C(q_{\text{old}})\) dollars. Negative quantities encode sell orders and
negative “payments” encode sale proceeds earned by the trader. At any time, the going
price of security \(j\) is \(\partial C / \partial q_j\), the cost per share for purchasing an infinitesimal quantity.
The full cost for purchasing any finite quantity is the integral of price evaluated from
\(q_{\text{old}}\) to \(q_{\text{new}}\), or \(C(q_{\text{new}}) - C(q_{\text{old}})\). Once the true outcome becomes known, the market
maker pays $1 per share to traders holding the winning security.

Deriving the cost function associated with a particular scoring rule is straightforward
if tedious. The cost function corresponding to the logarithmic scoring rule is


\[C(q) = b \ln\Big(\sum_j e^{q_j/b}\Big)\]


and the price function is \(\partial C / \partial q_j = e^{q_j/b} / \sum_k e^{q_k/b}\). The free parameter \(b\) controls both
the market maker’s risk of loss and the effective liquidity of the market. One can show
that the maximum possible loss incurred by the market maker is \(b \ln |\Omega|\). But a larger
b also means that more shares can be purchased at or near the current price without 
driving up the price too much, a measure of market liquidity and depth. The logarithmic 
scoring rule market maker has been implemented in several real-world settings with 
success, including at InklingMarkets, Net Exchange, and Microsoft. 
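
The logarithmic market scoring rule market maker is short enough to sketch in full. The code below evaluates the cost function \(b \ln \sum_j e^{q_j/b}\) and its gradient (the prices) for a list of outstanding quantities; the function names and the numerical values in the usage note are illustrative assumptions, and subtracting the maximum before exponentiating is a standard numerical precaution rather than part of the definition.

```python
import math

def cost(q, b):
    """Cost function C(q) = b * ln(sum_j exp(q_j / b))."""
    m = max(q)  # factor out exp(m / b) so the exponentials cannot overflow
    return m + b * math.log(sum(math.exp((qj - m) / b) for qj in q))

def prices(q, b):
    """Current price of each security: the partial derivatives of C, summing to 1."""
    m = max(q)
    weights = [math.exp((qj - m) / b) for qj in q]
    total = sum(weights)
    return [w / total for w in weights]

def trade_cost(q, delta, b):
    """Amount a trader pays to move outstanding shares from q to q + delta.

    Negative entries in delta are sales; a negative result is money paid to the trader.
    """
    q_new = [qj + dj for qj, dj in zip(q, delta)]
    return cost(q_new, b) - cost(q, b)
```

With two outcomes, b = 100, and no shares outstanding, both prices start at 0.5. Buying 10 shares of the first outcome costs a little more than 10 times 0.5, because the price rises along the way. A trader who buys a very large quantity of the outcome that eventually occurs collects almost b ln 2 more than she paid, which is the loss bound stated above for two outcomes.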

The cost function corresponding to the quadratic scoring rule is


\[C(q) = \frac{\sum_j q_j}{|\Omega|} + \frac{\sum_j q_j^2}{4b} - \frac{(\sum_j q_j)^2}{4b|\Omega|} - \frac{b}{|\Omega|}.\]

The quadratic scoring rule market maker is likely not of much practical interest. The 
market maker allows traders only to buy a small fixed number of shares of any security. 
Moreover, as soon as one upper limit is reached on any security, the market maker 
cannot accept buy orders for other securities. In contrast, the logarithmic scoring rule 
market maker can accept arbitrarily large quantities of buy or sell orders. 

As mentioned, a market scoring rule market maker can be viewed as a sequential 
shared version of a scoring rule. Conceptually, the market maker begins by setting 
prices equal to an initial probability estimate. The first trader to arrive agrees to (1) 
pay the market maker the scoring rule payment associated with the market maker’s 
probability estimate and (2) receive the scoring rule payment associated with the 
trader’s own probability estimate. Myopically, this modified scoring rule still incents 
the trader to reveal her true probability estimate. The final trader pays the scoring 
rule payment owed to the second-to-last trader and receives a scoring rule payment 
from the market maker. The market maker’s loss is bounded by the maximum possible 
payment to the final trader minus the payment from the first trader. One can show that 
the more conventional cost function formulation of the market maker is equivalent to 
the sequential shared scoring rule formulation. 
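
For the logarithmic rule, this equivalence can be checked in a few lines. The derivation below is a sketch for that one rule only; it takes the logarithmic scoring rule payment for a report p under outcome \(\omega\) to be \(b \ln p_\omega\) (any additive constant cancels between the payment made and the payment received), and the subscript notation is ours.

```latex
% Logarithmic market scoring rule: prices and cost function
p_j(q) = \frac{e^{q_j/b}}{\sum_k e^{q_k/b}}, \qquad C(q) = b \ln \sum_k e^{q_k/b}
\quad\Longrightarrow\quad b \ln p_j(q) = q_j - C(q).

% A trader moves the share vector from q_old to q_new, and outcome \omega occurs.
% Each winning share pays $1, and the trader has paid C(q_new) - C(q_old):
\text{profit}_\omega
  = (q_{\text{new},\omega} - q_{\text{old},\omega}) - \bigl(C(q_{\text{new}}) - C(q_{\text{old}})\bigr)
  = b \ln p_\omega(q_{\text{new}}) - b \ln p_\omega(q_{\text{old}}).
```

The right-hand side is the scoring rule payment for the trader's new estimate minus the payment for the estimate she replaced, which is the sequential shared scoring rule.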


26.4.2 Dynamic Parimutuel Markets 


A parimutuel game is a wagering game where players compete to earn as large a portion 
as possible of the total pool of money wagered by all players. Again consider a set \(\Omega\) of 
mutually exclusive and exhaustive outcomes. Players wager money on the outcome(s) 
of their choice. When the true outcome is revealed, players who wagered on the correct 
outcome split the total pool of money in proportion to the amount they bet. In a sense, 
the cost of purchasing an equal share of the winnings associated with any outcome 
is always a constant, say $1. A dynamic parimutuel market is a dynamic-cost variant 
of the parimutuel wagering game. As before, traders compete for a share of the total 
money wagered; however, the cost of a single share varies dynamically according to 
a cost function, thus allowing traders to sell their shares prior to the determination of 
the outcome for profits or losses. From a trader’s perspective, the mechanism acts as a 
market maker. 

A particularly natural cost function is the share-ratio cost function, which equates 
the ratio of prices of any two outcomes with the ratio of number of shares outstanding 
for the two outcomes. The share-ratio cost function is 


\[ C(q) = \kappa \sqrt{\sum_j q_j^2} \]

where \(\kappa\) is a free parameter. The corresponding price function is \(p_j = \kappa q_j / \sqrt{\sum_k q_k^2}\). 
This cost function is the unique dynamic parimutuel cost function satisfying the ratio 
constraint \(p_j/p_k = q_j/q_k\) for all j and k. Setting \(\kappa = 1\) yields a natural version where 
the price of each outcome is always less than 1, and the payoff per share of each 
outcome is always greater than 1. The share-ratio cost function is arbitrage-free and 
ensures that wagers on the correct outcome can never lose money. The market maker 
initiates the game with an allocation of shares q and a corresponding C(q) dollars, 
reflecting the market maker’s maximum risk of loss. 
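
The share-ratio quantities are easy to compute directly. The Python sketch below is illustrative only: the function names and the two-outcome example are our assumptions, and the payoff per share is taken to be the total money C(q) divided equally among the winning outcome's outstanding shares, which is our reading of the parimutuel payout described above.

```python
import math

def share_ratio_cost(q, kappa=1.0):
    """C(q) = kappa * sqrt(sum_j q_j^2): total money in the market."""
    return kappa * math.sqrt(sum(qj * qj for qj in q))

def share_ratio_prices(q, kappa=1.0):
    """p_j = kappa * q_j / sqrt(sum_k q_k^2)."""
    norm = math.sqrt(sum(qj * qj for qj in q))
    return [kappa * qj / norm for qj in q]

def payoff_per_share(q, winner, kappa=1.0):
    """Assumed payout: the whole pool C(q) split equally over winning shares."""
    return share_ratio_cost(q, kappa) / q[winner]

# Hypothetical state: 3 shares outstanding on outcome 0, 4 on outcome 1.
q = [3.0, 4.0]
prices = share_ratio_prices(q)
print(share_ratio_cost(q), prices, payoff_per_share(q, 1))
```

With \(\kappa = 1\) both prices are below 1, their ratio equals the share ratio 3/4, and the payoff per share for either outcome exceeds 1.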

Besides the different form of the cost function, the main difference between a market 
scoring rule market maker and a dynamic parimutuel market maker is that the former 
pays a fixed $1 per share to winning shareholders while the latter pays an equal portion 
of the total amount wagered to winning shareholders. Because of the added uncertainty 
surrounding the payoff per share, trading strategies in a dynamic parimutuel market 
are more complicated, and the interpretation of the price as a forecast is less direct. On 
the other hand, as a gambling game, the added uncertainty may appeal to risk-seeking 
traders. 


26.5 Distributed Computation through Markets 


Sections 26.3 and 26.4 concerned algorithmic components of the operation of a pre- 
diction market. In this section, we turn that viewpoint inside out, and study the system 
of market and traders as a computational device (that is perhaps a part of a larger 
computation). We construct and analyze a simple model of a prediction market in order to 
gain insight into two fundamental properties of any computational device: What can it 
compute, and how fast does the computation run? 

Where is this computation taking place? The traders use their private information 
to attempt to make profitable trades. Importantly, they observe the market clearing 
price (or the actual sequence of trades), and update their beliefs about the security 
value. The computation of the market as a whole occurs through the traders’ belief- 
updating processes; this is where a trader takes a signal (the market price) that reflects 
some information about other traders, and combines it logically with her own private 
information. 

The process by which the market prices adjust is important for another reason: 
Recall from Section 26.2.2.3 that the rational expectations equilibrium definition does 
not address the issue of how traders reach the equilibrium price correspondence. We 
shall see that this can be problematic: With a plausible belief-updating process, the 
market prices may sometimes get stuck at a noninformative equilibrium, even though a 
fully revealing equilibrium exists. Thus, we need a better understanding of the dynamics 
of the price adjustment process. The following model provides some insight. 


26.5.1 Boolean Market Model 


We model a very simple class of elementary computation problems — computing a 
Boolean function — and study what can be computed with a single security. Initially, 
suppose that there are n traders, each with a single bit \(x_i\) of private information; we 
use x to denote the vector \((x_1, \ldots, x_n)\). This model can be translated to a partition 
model as described in Section 26.2.1: The state space is \(\Omega = \{0, 1\}^n\), and each agent i 
initially has a partition \(\pi_i = \{\{x \in \Omega \mid x_i = 0\}, \{x \in \Omega \mid x_i = 1\}\}\) with two components. 
We are interested in learning the value of a Boolean function \(f : \{0, 1\}^n \to \{0, 1\}\) of 
the combined information x. To do this, we set up a market in a security F that will 
pay $1 if f(x) is ultimately revealed to be 1, and $0 otherwise. The form of f (the 
description of the security) is common knowledge among agents. We sometimes refer 
to the \(x_i\) as the input bits. At some time in the future after trading is completed, the 
true value of f(x) is revealed. Note that the traders’ combined information is enough 
to determine the exact value of f(x); thus, if the market is truly efficient, we expect its 
equilibrium trading price to be equal to f(x). 

To have a model that permits analysis, we next need to specify how the market prices 
are formed, and how the agents bid in the market and react to market information. 


26.5.2 Bid Format and Price Formation 


Continuous double auctions are complex systems, and there is no standard way to 
analytically model the price formation process; we use the following linear model that 
loosely captures the nature of the market, and permits analysis. The market proceeds 
in synchronous rounds. In each round, each agent i submits a bid \(b_i\) and a quantity 
\(q_i\). The semantics are that agent i is supplying a quantity \(q_i\) of the security and an 
amount \(b_i\) of money to be traded in the market. For simplicity, we assume that there 
are no restrictions on credit or short sales, and so an agent’s trade is not constrained 
by her possessions. The market clears in each round by settling at a single price that 
balances the trade in that round: The clearing price is \(p = \sum_i b_i / \sum_i q_i\). At the end of 
the round, agent i holds a quantity \(q_i'\) proportional to the money she bid: \(q_i' = b_i/p\). In 
addition, she is left with an amount of money \(b_i'\) that reflects her net trade at price p: 
\(b_i' = b_i - p(q_i' - q_i) = p q_i\). Note that agent i’s net trade in the security is a purchase 
if \(p < b_i/q_i\) and a sale if \(p > b_i/q_i\). 
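
A single clearing round of this linear model is short enough to write out. The following Python sketch is a minimal illustration; the function name `clear_round` and the two-agent numbers are our own assumptions.

```python
def clear_round(bids, quantities):
    """Clear one round: returns the price, post-trade holdings, and post-trade money.

    bids[i] is the money b_i and quantities[i] is the quantity q_i supplied by agent i.
    """
    price = sum(bids) / sum(quantities)
    new_quantities = [b / price for b in bids]      # q_i' = b_i / p
    new_money = [price * q for q in quantities]     # b_i' = p * q_i
    return price, new_quantities, new_money

# Hypothetical round: two agents each supply one unit and bid 0.5 and 1.0.
price, new_q, new_b = clear_round([0.5, 1.0], [1.0, 1.0])
print(price, new_q, new_b)
```

The round only redistributes what was supplied: total holdings and total money are unchanged, the agent whose ratio \(b_i/q_i\) is below the price ends up with less of the security (a sale), and the other agent ends up with more (a purchase).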

After each round, the clearing price p is publicly revealed. Agents then revise 
their beliefs according to any information garnered from the new price. The next 
round proceeds as the previous. The process continues until an equilibrium is reached, 
meaning that prices and bids do not change from one round to the next. 

Here, we make a further simplifying restriction on the trading in each round: We 
assume that \(q_i = 1\) for each agent i. This serves two analytical functions: First, it 
forces trade to occur. Our model has only rational, risk-neutral, informed traders, and 
the classic no-trade results would apply. As we have seen, there are several reasons 
why rational traders would want to trade in practice (subsidies, insurance traders, 
etc.). This forced trade assumption allows us to capture this practical fact without the 
complications of explicitly modeling these reasons. Second, the fact that agents know 
the volume of other agents’ trades improves their ability to learn from prices. This 
perhaps gives our agents too much power; but as we shall see, there are still situations 
in which the market does not converge to the correct value. 


26.5.3 Agent Behavior 


We assume that agents are risk-neutral, myopic,³ and bid truthfully: Each agent in each 
round bids his or her current valuation of the security, which is that agent’s estimation 
of the expected payoff of the security. Expectations are computed according to each 
agent’s probability distribution. We assume that there is a common prior probability 
distribution P over values of x shared by all agents; the agents use their private 
information and the observed prices to update their beliefs via Bayes’ rule. We also 
assume that it is common knowledge that all the agents behave in the specified manner. 


Example 26.8 Consider a market with two agents, who have private bits \(x_1\) 
and \(x_2\), respectively. Furthermore, assume that the prior probability distribution is 
uniform, so that each of the four possible values for x will have a prior probability 
of 1/4. Now, we introduce a security F based on the OR function \(f(x) = x_1 \vee x_2\); 
that is, F eventually pays $1 if f(x) is 1. Suppose that agent 1 observed \(x_1 = 0\). 
Then, conditioned on this information, agent 1 believes \(P((x_1, x_2) = (0, 0)) = 
P((x_1, x_2) = (0, 1)) = 1/2\). Then agent 1’s initial expectation of the value of F is 
0.5; hence, in our model, she would bid \(b_1 = 0.5\) in the first round of trading. On 
the other hand, suppose that agent 2 observed \(x_2 = 1\). Then, her posterior beliefs 
would be \(P((x_1, x_2) = (0, 1)) = P((x_1, x_2) = (1, 1)) = 1/2\). She would know for 
certain that f is 1, and would bid \(b_2 = 1\). The clearing price of the market after 
the first round would thus be 0.75. 


26.5.4 Equilibrium Price Characterization 


We now turn to analyzing the equilibrium trading price in the market. Our analysis 
builds on powerful results from the economic literature on common knowledge of 
aggregates. 

Recall that there is a set of possible states Q, together with a common prior proba- 
bility distribution P. As trading proceeds, some possible states can be logically ruled 
out, but the relative likelihoods among the remaining states are fully determined by the 
prior P. So the common knowledge after any stage is completely described by the set 
of states that an external observer — with no information beyond the sequence of prices 
observed — considers possible (along with the prior). Similarly, the knowledge of agent 
i at any point is also completely described by the set of states she considers possible. 


3 Myopic behavior means that agents treat each round as if it were the final round: They do not reason about how 
their bids may affect the bids of other agents in future rounds. 
We use the notation \(S^r\) to denote the common-knowledge possibility set after round r, 
and \(S_i^r\) to denote the set of states that agent i considers possible after round r. 

Initially, the set of states considered possible by an external observer is the set 
\(S^0 = \Omega\). However, each agent i also knows the value of her bit \(x_i\); thus, her knowledge 
set \(S_i^0\) is the set \(\{y \in \Omega \mid y_i = x_i\}\). Agent i’s first-round bid is her conditional expectation 
of the event f(x) = 1 given that \(x \in S_i^0\). All the agents’ bids are processed, and the 
clearing price \(p^1\) is announced. From his knowledge of the prior and the information 
structure, the external observer can determine the function \(\mathrm{price}^1(x)\) that relates the 
first round price to the true state x. Thus, he can rule out any vector x that would have 
resulted in a different clearing price. 

Thus, the common knowledge after round 1 is the set \(S^1 = \{y \in S^0 \mid \mathrm{price}^1(y) = p^1\}\). 
Agent i knows the common knowledge and, in addition, knows the value of bit \(x_i\). 
Hence, after every round r, the knowledge of agent i is given by \(S_i^r = \{y \in S^r \mid y_i = x_i\}\). 
Note that, because knowledge can only improve over time, we must always have 
\(S_i^r \subseteq S_i^{r-1}\) and \(S^r \subseteq S^{r-1}\). Thus, after a finite number of rounds, we must reach an 
equilibrium after which no player learns any further information. We use \(S^\infty\) to denote 
the common knowledge at this point, and \(S_i^\infty\) to denote agent i’s knowledge at this 
point. Let \(p^\infty\) denote the clearing price at equilibrium. 

We now state (without proof) a result that follows immediately from known results 
on common knowledge of aggregates: 


Theorem 26.9 In the Boolean market, the following conditions must hold at 
equilibrium: 


\[ P(f(y) = 1 \mid y \in S^\infty) = p^\infty \tag{26.1} \]
\[ \forall i \quad P(f(y) = 1 \mid y \in S_i^\infty) = p^\infty \tag{26.2} \]


Informally, Theorem 26.9 tells us that, at equilibrium, all agents must have exactly 
the same expectation of the value of the security, and that this must agree with the ex- 
pectation of an uninformed observer. Note that they may still have differing knowledge 
sets, as long as conditioning on their respective knowledge sets yields the same expec- 
tation. However, reaching agreement is not sufficient for the purposes of information 
aggregation. We also want the price to reveal the actual value of f(x). The following 
example shows that it is possible that the equilibrium price \(p^\infty\) of the security F will 
not be either 0 or 1, and so we cannot infer the value of f(x) from it. 


Example 26.10 Consider two agents 1 and 2 with private input bits \(x_1\) and 
\(x_2\), respectively. Suppose that the prior probability distribution is uniform, i.e., 
\(x = (x_1, x_2)\) takes the values (0, 0), (0, 1), (1, 0), and (1, 1) each with probability 
1/4. Now, suppose that the aggregate function we want to compute is the XOR 
function, \(f(x) = x_1 \oplus x_2\). To this end, we design a market to trade in a Boolean 
security F, which will eventually pay off $1 iff \(x_1 \oplus x_2 = 1\). 

If agent 1 observes \(x_1 = 1\), she estimates the expected value of F to be the 
probability that \(x_2 = 0\) (given \(x_1 = 1\)), which is 1/2. If she observes \(x_1 = 0\), her 
expectation is the conditional probability that \(x_2 = 1\), which is also 1/2. Thus, in 
either case, agent 1 will bid 0.5 for F in the first round. Similarly, agent 2 will 
also always bid 0.5 in the first round. Hence, the first round of trading ends with 
a clearing price of 0.5. From this, agent 2 can infer that agent 1 bid 0.5, but this 
gives her no information about the value of \(x_1\): it is still equally likely to be 
0 or 1. Agent 1 also gains no information from the first round of trading, and 
hence neither agent changes her bid in the following rounds. Thus, the market 
reaches equilibrium at this point. As predicted by Theorem 26.9, both agents have 
the same conditional expectation (0.5) at equilibrium. However, the equilibrium 
price of the security F does not reveal the value of f(x,, x2), even though the 
combination of agents’ information is enough to determine it precisely. 
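
The round-by-round process of Section 26.5.4 can be simulated directly for small n. The Python sketch below is our own illustration under the model's assumptions (unit quantities, truthful myopic bids, a common prior, exact prices); the function name `simulate`, its arguments, and the use of exact fractions are choices we made for the example. It tracks the common-knowledge set, refines it after each announced price, and stops when a round reveals nothing new.

```python
from fractions import Fraction
from itertools import product

def simulate(f, n, x, prior=None):
    """Run the Boolean market on true state x; return (price history, final common set)."""
    states = list(product((0, 1), repeat=n))
    if prior is None:
        prior = {s: Fraction(1, len(states)) for s in states}   # uniform prior

    def bid(i, state, common):
        # Agent i's expectation of f given the common set and her own bit.
        possible = [y for y in common if y[i] == state[i]]
        total = sum(prior[y] for y in possible)
        return sum(prior[y] for y in possible if f(y)) / total

    def price(state, common):
        # With unit quantities the clearing price is the average bid.
        return sum(bid(i, state, common) for i in range(n)) / n

    common = [s for s in states if prior[s] > 0]
    history = []
    while True:
        p = price(x, common)
        history.append(p)
        # The observer keeps only the states that would have produced this price.
        refined = [y for y in common if price(y, common) == p]
        if refined == common:            # nothing learned: equilibrium
            return history, common
        common = refined

or_prices, or_set = simulate(lambda y: y[0] | y[1], 2, (0, 1))
xor_prices, xor_set = simulate(lambda y: y[0] ^ y[1], 2, (1, 0))
print(or_prices, or_set)
print(xor_prices, xor_set)
```

For the OR security of Example 26.8 the price moves from 3/4 to 1, the true value, even though the observer is left with two possible states. For the XOR security the first price of 1/2 rules out nothing, so the market stops there.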


26.5.5 Characterizing Computable Aggregates 


We now give a necessary and sufficient characterization of the class of functions f 
such that, for any prior distribution on x, the equilibrium price of F will reveal the true 
value of f. We show that this is exactly the class of weighted threshold functions: 


Definition 26.11 A function \(f : \{0, 1\}^n \to \{0, 1\}\) is a weighted threshold function 
iff there are real constants \(w_0, w_1, w_2, \ldots, w_n\) such that 


\[ f(x) = 1 \quad \text{iff} \quad w_0 + \sum_{i=1}^{n} w_i x_i \ge 1 \]
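
To make the definition concrete, the Python sketch below evaluates a threshold function from its weights and searches a small integer grid for weights that realize a given Boolean function. The helper names and the grid are our assumptions. A failed search on a finite grid is not in general a proof that no weights exist; for XOR, however, the four defining inequalities are contradictory for any real weights, so the empty result is the right answer.

```python
from itertools import product

def threshold_fn(w0, w):
    """Return f with f(x) = 1 iff w0 + sum_i w_i * x_i >= 1."""
    return lambda x: int(w0 + sum(wi * xi for wi, xi in zip(w, x)) >= 1)

def find_threshold_weights(f, n, grid=range(-3, 4)):
    """Search integer weights in `grid` realizing f; return (w0, w) or None."""
    inputs = list(product((0, 1), repeat=n))
    for candidate in product(grid, repeat=n + 1):
        w0, w = candidate[0], candidate[1:]
        g = threshold_fn(w0, w)
        if all(g(x) == f(x) for x in inputs):
            return w0, w
    return None

def or_fn(x):
    return x[0] | x[1]

def xor_fn(x):
    return x[0] ^ x[1]

or_weights = find_threshold_weights(or_fn, 2)
xor_weights = find_threshold_weights(xor_fn, 2)
print(or_weights, xor_weights)
```

One valid choice for OR is \(w_0 = 0\), \(w_1 = w_2 = 1\); the search may return a different but equally valid triple.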


We now state the following results; because of space restrictions, we do not include 
the proof. The OR and XOR examples (Examples 26.8 and 26.10) give some insight 
into these results. 


Theorem 26.12 If f is a weighted threshold function, then, for any prior probability 
distribution P, the equilibrium price of F is equal to f(x). 


Theorem 26.13 Suppose \(f : \{0, 1\}^n \to \{0, 1\}\) cannot be expressed as a 
weighted threshold function. Then there exists a prior distribution P for which 
the price of the security F does not converge to the value of f(x). 


26.5.6 Convergence Time 


The model also enables analysis of the number of rounds it takes for the market to 
converge. We state (but do not prove) the results here. 


Theorem 26.14 Let f be a weighted threshold function with n inputs, and let 
P be an arbitrary prior probability distribution. Then, after at most n rounds of 
trading, the price reaches its equilibrium value \(p^\infty = f(x)\). 


Theorem 26.15 There is a function \(C_n\) with 2n inputs and a prior distribution 
\(P_n\) such that, in the worst case, the market takes n rounds to reveal the value of 
\(C_n\). 


26.6 Open Questions 


We conclude with some open questions and future work. 


Combinatorial Prediction Markets 


• Section 26.3 discusses combinatorial prediction markets from the auctioneer’s perspective. 
The bidder’s perspective is also interesting to examine. How should bidders choose 
Boolean formulas \(\phi\), perhaps subject to constraints or penalties on the number or 
complexity of bids? How should bidders choose quantities and prices? 

• Are there less expressive bidding languages that admit polynomial matching algorithms 
yet are still practically useful and interesting? 

• Although exact matching in general is intractable, are there good heuristics that achieve 
matches in many cases, or approximate a matching? In particular, is there a practically 
useful logical reduction algorithm for finding matches? 

• We can study permutation combinatorics instead of Boolean combinatorics. In this case, 
the state space \(\Omega\) consists of all possible orderings of a set of statistics, say finish times 
in a horse race. Then a high-level bidding language might allow wagers on events like 
“\(X_1\) will win,” “\(X_1\) will finish in the top 3,” “\(X_1\) will beat \(X_2\),” etc. Are there natural 
bidding languages with tractable matching problems in this setting? 

• Can the auctioneer share the surplus partially or fully with the traders? What are the 
incentive properties of the resulting mechanisms? 

• What is the complexity of finding a match between a single new order and a set of 
old orders known to have no matches among them? The objective function would be 
to satisfy as much of the new order as possible, giving the advantage of any price 
differences to the new order. (This is the standard continuous double auction rule.) 

• We may consider a market to be in computational equilibrium if no computationally 
bounded player can find a strategy that increases utility. Can a market achieve a 
computational equilibrium that is not a true equilibrium? Under what circumstances? 


Automated Market Makers 


• For every bidding language that admits a polynomial time matching algorithm as defined 
in Section 26.3, does there exist a corresponding polynomial time market scoring rule 
market maker algorithm? 

• The market makers of Section 26.4 can be considered as simple online algorithms (see 
Chapter 16). Orders arrive one at a time and the market maker must decide to (partially) 
accept or reject the order under a constraint of bounded worst-case loss. Are there other 
online algorithms that can accept more orders for the same worst-case bound on loss? 


Distributed Computation Through Markets 


• The market model in Section 26.5 assumes that the clearing price is known with unlimited 
precision. Furthermore, the model assumes that none of the traders are misinformed or 
irrational. What aggregates can be computed even in the presence of noisy prices and 
traders? 

• If the agents have computed the value of the function and a small number of input bits 
are switched, can the new value of the function be computed incrementally and quickly? 

• In the model presented, distributed information is aggregated through a centralized 
market computation. Can we find a good distributed-computational model of a decentralized 
market? 

• What is the complexity of the computations that agents must do to update their beliefs 
after each round? 

• The model in Section 26.5 directly assumes that agents bid truthfully. Is there a tractable 
model that assumes only rationality and solves for the resulting game-theoretic solution 
strategy? 

• The negative results in Section 26.5 (Theorems 26.13 and 26.15) examine worst-case 
scenarios, and thus involve very specific prior probability distributions and initial 
information states. On the other hand, simulations seem to suggest that almost every threshold 
function’s expected convergence time is near constant, where expectation is taken over 
the common prior. Can we prove results about average-case convergence? 

• Nonthreshold functions can be implemented by combining two or more threshold 
functions. What is the minimum number of threshold securities required to implement a given 
function? Are there ways to configure securities to speed up convergence to equilibrium? 


26.7 Bibliographic Notes 


This section surveys some of the most directly relevant related work; a more extensive 
bibliography will be made available on the authors’ home pages. We also point readers 
to excellent survey articles on prediction markets by Tziralis and Tatsiopoulos (2006), 
Wolfers and Zitzewitz (2004, in press), and Berg and Rietz (2003). 

A number of studies investigate forecast accuracy and trader behavior on the 
Iowa Electronic Market, one of the longest-running active prediction markets. Berg 
et al. (2001) surveys this work. Other empirical studies examine markets on Trade- 
Sports.com, an Irish betting exchange (Wolfers and Zitzewitz, 2006; Wolfers et al., 
2007; Tetlock, 2004, 2006). Perhaps surprisingly, even play-money market games per- 
form well compared to experts and real-money markets (Chen et al., 2005; Pennock 
et al., 2001a, 2001b; Servan-Schreiber et al., 2004; Spann and Skiera, 2003; Mangold 
et al., 2005). The field tests at Hewlett Packard were conducted by Chen and Plott 
(2002) and Plott (2000). Sunder (1995) reviews a number of laboratory experiments 
involving prediction markets. 

A common concern is that prediction market prices may be manipulated by wealthy 
traders with ulterior motives. Rhode and Strumpf (2006) analyze manipulation attempts 
in real markets and find that the effects of manipulations are typically minimal and 
short lived. Hanson et al. (2006) find that markets appear robust to manipulation in a 
laboratory setting. 

The theory of rational expectations was introduced by Muth (1961) and further 
developed by Lucas (1972). The article by Grossman (1981) is a good introductory 
survey. No-trade theorems (Milgrom and Stokey, 1982) have their roots in the theory of 
common knowledge (Aumann, 1976). Several authors discuss a procedural explanation 
of rational expectations, showing that repeated announcement of an aggregate statistic 
of the agents’ beliefs will drive the agents to a consensus, if they begin with common 
priors (Hanson, 1998; McKelvey and Page, 1986, 1990; Nielsen et al., 1990). The 
oft-cited efficient market hypothesis (Fama, 1970) is rooted in rational expectations 
theory. 

The analysis of combinatorial prediction markets in Section 26.3 follows Fortnow 
et al. (2005). Chen et al. (2007) conduct an analogous study of permutation 
combinatorics. Bossaerts et al. (2002) introduce the combined value trading framework, pro- 
viding algorithms for clearing prediction markets when combined orders are allowed. 

The description in Section 26.3.2 of compact prediction markets that take advantage 
of (conditional) independence among events is based on work by Pennock and Wellman 
(2000, 2005). 

Market scoring rules were introduced by Hanson (2003, 2006). Hanson describes 
how the market scoring rule market maker is especially well suited for combina- 
torial prediction markets, and discusses some of the associated computational chal- 
lenges. Scoring rules have long been used to measure forecast accuracy (Savage, 1971; 
Winkler and Murphy, 1968). Dynamic parimutuel markets were introduced by Pennock 
(2004). 

Section 26.5 follows the work of Feigenbaum et al. (2005). Chen et al. (2006) 
examine an extended model where aggregate uncertainty remains in equilibrium. 
Theorem 26.9 follows from a result due to McKelvey and Page; see Nielsen et al. 
(1990) for more details. The market model is based on a model due to Shapley and 
Shubik (1977). Ronen and Wahrmann (2005) investigate a slightly different model 
of prediction games, in which a mechanism designer seeks to compute a function of 
agents’ information, but agents incur a cost to access their own information. 
Exercises 


26.1 Describe how the market scoring rule market maker of Section 26.4 can be extended 
to handle limit orders of the form “buy at most q units of Sy at price less 
than or equal to p.” For simplicity, assume that partially filled limit orders do not 
remain active in the system. 


26.2 A straightforward implementation of a combinatorial market maker, where \(|\Omega| = 2^{|E|}\), 
requires exponential space to explicitly maintain the vector q, the number 
of shares outstanding of each of the \(2^{|E|}\) possible outcomes (states). Derive a 
polynomial-space version of a combinatorial logarithmic market scoring rule mar- 
ket maker, where the input is the list of previously accepted orders and the new 
order and the output is C(q). Orders can be either combined orders or compound 
orders, as defined in Section 26.3. 


26.3 Define the conditional cost function for the logarithmic market scoring rule as 
\(C_\psi(q) = b \ln\bigl(\sum_{\omega \in \psi} e^{q_\omega/b}\bigr)\): the same cost function as before but summed only 
over states in \(\psi\). The conditional cost function can be used to price conditional 
securities. The cost to buy \(\delta\) shares of \(S_{\phi|\psi}\) is \(C_\psi(q + \delta \cdot 1_\phi) - C_\psi(q)\). Also, by Bayes’ 
rule, we know that the instantaneous price of \(S_{\phi|\psi}\) equals the price of \(S_{\phi \wedge \psi}\) divided 
by the price of \(S_\psi\). 


(a) Verify that the price of \(S_{\phi|\psi}\) defined in this way, integrated from 0 to \(\delta\), equals 
\(C_\psi(q + \delta \cdot 1_\phi) - C_\psi(q)\). 
(b) After a trader purchases \(\delta\) shares of \(S_{\phi|\psi}\), what is the new quantity vector \(q_{\text{new}}\)? 
(Hint: it is not \(q_{\text{old}} + \delta \cdot 1_\phi\).) 


26.4 Consider the two-agent “OR” market of Example 26.8. Suppose that \(x_1 = 0\) and 
\(x_2 = 1\). Prove that bidding truthfully is not a Nash equilibrium. To do so, it suffices 
to show that if bidder 1 bids truthfully, then bidder 2’s optimal bid is not truthful. 


CHAPTER 27 


Manipulation-Resistant 
Reputation Systems 


Eric Friedman, Paul Resnick, and Rahul Sami 


Abstract 


This chapter is an overview of the design and analysis of reputation systems for strategic users. 
We consider three specific strategic threats to reputation systems: the possibility of users with poor 
reputations starting afresh (whitewashing); lack of effort or honesty in providing feedback; and 
sybil attacks, in which users create phantom feedback from fake identities to manipulate their own 
reputation. In each case, we present a simple analytical model that captures the essence of the strategy, 
and describe approaches to solving the strategic problem in the context of this model. We conclude 
with a discussion of open questions in this research area. 


27.1 Introduction: Why Are Reputation Systems Important? 


One of the major benefits of the Internet is that it enables potentially beneficial in- 
teractions, both commercial and noncommercial, between people, organizations, or 
computers that do not share any other common context. The actual value of an interac- 
tion, however, depends heavily on the ability and reliability of the entities involved. For 
example, an online shopper may obtain better or lower-cost items from remote traders, 
but she may also be defrauded by a low-quality product for which redress (legal or 
otherwise) is difficult. 

If each entity’s history of previous interactions is made visible to potential new 
interaction partners, several benefits ensue. First, a history may reveal information 
about an entity’s ability, allowing others to make choices about whether to interact 
with that entity, and on what terms. Second, an expectation that current performance 
will be visible in the future may deter moral hazard in the present, that hazard being 
the temptation to cheat or exert low effort. In other words, visible histories create an 
incentive to reliably perform up to the entity’s ability. Finally, because histories reveal 
information about abilities, entities with higher abilities will be drawn to participate, 
as they will be distinguishable from those of lower abilities, and respected or rewarded 
appropriately. In other words, visible histories avoid problems of adverse selection. 
Figure 27.1. Example illustrating reputation system dynamics. 


A reputation system collects, maintains, and disseminates reputations—aggregated 
records from past interactions—of each participant in a community. The rapid advance 
in computational power and communication capacity on the Internet has been a double- 
edged sword: On one hand, it has enabled the construction of reputation systems that 
can store, gather, and process large quantities of information. On the other hand, it 
has allowed more sophisticated attacks on the integrity of the reputation system to be 
mounted. 

Reputation systems have been designed for use in many settings, including online 
auctions, e-storefronts, and a wide range of peer-to-peer systems. These systems naturally 
have differing interfaces, and track different aspects of user behavior. However, they 
all share certain underlying components, which are illustrated in Figure 27.1. 

The core of a reputation system involves collecting records of entity A’s past be- 
havior, and then disseminating reputation information to others who may potentially 
interact with A in the future. (We use the term “entity” to denote the real-world entity 
to which we seek to attach a reputation; typically, this is an individual person, but it 
could also be an organized group or a firm, or a node in a computer network.) The 
records are based on both objective information independently collected about inter- 
actions and feedback from the entities about each other. The exact nature of both the 
objective information and the subjective feedback depends on the application. For an 
online auction, the system may record the agreed sale price and ask the buyer and seller 
to report their satisfaction with each other’s integrity and performance after a trade. In 
a peer-to-peer system, we might ask each peer to monitor and report how often another 
peer makes its system available. 

In principle, user A’s reputation could simply be a concatenation of all records 
pertaining to A, but in practice, reputations are usually numerical summary values that 
permit direct comparison between users. Thus, reputation systems include an internal 
aggregation procedure to convert the reports to reputations. If all reports conform to a 
common structure, there are two natural dimensions along which to aggregate reports: 
(1) aggregation across users, by computing a statistic of all other users’ reports about 
A; and (2) aggregation across time, by computing a statistic of all past reports. In addition, 
the aggregation function may use other structure derived from the reports, or from 
the reputations themselves. In particular, it often relies on some notion of transitivity 
of trust, in the sense that reports from users with high reputation are weighted more 
heavily than reports from users with low reputation. 
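
The two aggregation dimensions and the reputation weighting described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (`Report`), the weighted mean as the summary statistic, and the geometric time decay are hypothetical choices, not a procedure prescribed by the text.

```python
from dataclasses import dataclass


@dataclass
class Report:
    rater: str    # who submitted the report
    subject: str  # whom the report is about
    period: int   # time index of the interaction
    score: float  # numeric rating, e.g. in [0, 1]


def aggregate(reports, subject, rater_weight=None, decay=1.0, now=0):
    """Weighted mean of all reports about `subject`.

    rater_weight: optional dict rater -> weight, a stand-in for the rater's
                  own reputation (transitive trust); unlisted raters get 1.0.
    decay:        per-period multiplier in (0, 1]; older reports count less.
    Returns None when there are no reports about `subject`.
    """
    rater_weight = rater_weight or {}
    num = den = 0.0
    for r in reports:
        if r.subject != subject:
            continue
        w = rater_weight.get(r.rater, 1.0) * decay ** (now - r.period)
        num += w * r.score
        den += w
    return num / den if den > 0 else None
```

With `decay=1.0` and no weights this reduces to a plain average across users and time; raising one rater's weight pulls the summary toward that rater's reports.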

Economists have studied models where entities strategically choose actions with an 
eye toward the histories they will generate. In these models, the link between actions 
and outcomes is probabilistic (bad actions sometimes lead to good outcomes and vice 
versa) or outcomes are observed with some error. The analysis of these models is 
interesting and complex, but beyond the scope of this chapter. 

Rather than threats to the informativeness of a user history, we focus our attention on 
threats to the reputation system itself, the system that collects histories and associates 
them with entities. When the histories include subjective feedback, that feedback may 
not be reported or may not be reported honestly. Histories may even include phantom 
feedback from nonexistent interactions. 

A second vulnerability comes from the fact that histories may not be tied directly 
to entities, but rather to online pseudonyms. In many systems, pseudonyms are cheap, 
which leads to two threats: an entity may jettison its pseudonym if it accumulates a bad 
reputation, and an entity may acquire many pseudonyms and have them rate each other 
favorably in order to inflate their reputations. 

To summarize, we consider three threats to the integrity of reputation systems: 


(i) Whitewashing. An entity may acquire a new pseudonym and start over with a clear 
reputation. 
(ii) Incorrectly reported feedback. Entities may not report feedback or may not report it 
honestly. 
(iii) Phantom feedback. An entity may provide feedback for interactions that never took 
place, perhaps using “sock puppet” identities (or sybils) created for the sole purpose 
of providing such phantom feedback. 


We begin in Section 27.2 with a stylized model of interactions over time in a market. 
Initially, in Section 27.3, we assume that the available objective data about interactions 
are sufficient to generate informative histories, even without any reporting of subjective 
feedback. We consider the threat of whitewashing, where an entity can start over with 
a new pseudonym, which will not be linked to the history of actions taken under the 
previous pseudonym. Reputations can still create an incentive for good behavior, but 
only if a pseudonym with no history is forced to “pay its dues” in some fashion while 
it builds up a history of good actions. 

Section 27.4 relaxes the assumption of objective data about actions. Feedback about 
interactions may not be reported correctly. Entities may not report feedback or may 
not report it honestly, for a variety of reasons, including fear of retaliation, or a desire 
to be viewed as a nice or skilled evaluator. 

One approach is to treat the reporting of feedback about an action as itself an 
action in some other domain. A history of feedback reports made by an entity can be 
generated and, suitably aggregated, becomes an entity’s reputation as a rater. Just as 
in any reputation system, rater reputations can deter moral hazard, creating incentives 
for effort and honest reporting. It may, however, be difficult to assess the quality of 
subjectively reported feedback. We present a mechanism that does so by comparing it 
with other subjectively reported feedback. 

Section 27.5 takes a second approach. Rather than directly assessing the quality of 
subjectively provided feedback, it assumes that an entity’s reputation as a rater is the 
same as its reputation as an actor in the original domain. This leads to a notion of 
transitive trust: if an entity’s actions in the original domain lead it to have a positive 
reputation, the entity is presumed to be a good rater as well, and its ratings are treated 
as more credible and weighted more highly in computing the reputations of other 
entities. For example, positive feedback from an eBay member with a good reputation 
would count more than positive feedback from a member with a bad reputation. This 
naturally leads to a graph model that represents entities and their feedback about other 
entities, with actions in the original domain not represented explicitly. Reputations are 
computed as scores for nodes in the graph, subject to the constraints imposed by the 
link structure of feedback among entities. We present both possibility and impossibility 
results on how transitive trust algorithms can handle the threats of incorrectly reported 
feedback and the problem of phantom feedback from sock puppet entities, the so-called 
sybil attack. 


27.2 The Effect of Reputations 


Economists have developed many game-theoretic models of the impact of reputations. 
In this section we present some of the fundamental ideas and technical tools necessary. 
We begin with an (over)simplified example. 

Consider the “prisoners’ dilemma,” a classic model from the early days of game 
theory. There are two agents, Alice (A) and Bob (B), who interact. If both agents 
cooperate (C) then each gains 1 unit of utility, while if they both defect they gain 
0; however if one cooperates and the other defects (D), the defector gains 2 and the 
cooperator loses 1. We summarize this as $\pi_A(C, C) = 1$, $\pi_A(D, D) = 0$, $\pi_A(D, C) = 2$, 
and $\pi_A(C, D) = -1$; $\pi_B$ is defined similarly by symmetry. 

Clearly the outcome of this game, when played a single time, should be (D, D) since 
it is a dominant strategy for both agents. In an infinitely repeated game, however, a 
player may choose C and accept lower payoffs in one round to increase the probability 
that partners will play C against her in future stage games, and thus increase her future 
payoffs. We denote the game played in each round as the stage game for that round. 
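
The claim that D is a dominant strategy in the one-shot game can be checked mechanically. The sketch below encodes the stage-game payoffs given above; the names `PAYOFF` and `is_strictly_dominant` are illustrative only.

```python
# Payoff to a player, indexed by (own action, other player's action).
PAYOFF = {("C", "C"): 1, ("D", "D"): 0, ("D", "C"): 2, ("C", "D"): -1}


def is_strictly_dominant(action, actions=("C", "D")):
    """True if `action` does strictly better than every alternative,
    whatever the other player does."""
    return all(
        PAYOFF[(action, other)] > PAYOFF[(alt, other)]
        for other in actions
        for alt in actions
        if alt != action
    )
```

Here `is_strictly_dominant("D")` holds because 2 > 1 against a cooperator and 0 > -1 against a defector.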

Define the discounted payoff to player $i$ in stage game $t$ to be $\pi_i^t \delta^t$, where $\pi_i^t$ 
is the actual payoff in round $t$ and $0 < \delta < 1$ is the discount factor. The idea of a 
discount factor is that it is somehow preferable to get a payoff in the current round 
rather than in the next round. If the payoffs are monetary, the possibility of investing 
the payoff at some interest rate provides a good intuition for why a discount factor is 
needed. 

We will analyze strategy alternatives that consist of decision rules about which action 
to play in each stage game, contingent on a player’s own history and the histories of 
all other players. The discounted average payoff of a strategy, played infinitely into the 
future, is defined as 

$$\pi_i = (1 - \delta) \sum_{t=0}^{\infty} \delta^t \pi_i^t.$$


In this infinitely repeated model, consider the Grim strategy: play C unless any player 
has played D in a previous round. This strategy, pursued by both players, denoted 
(Grim, Grim), is a Subgame Perfect Nash Equilibrium (SPNE), meaning that, if all 
players pursue this strategy, there is no stage game at which any player would want to 
deviate from the strategy. 

To prove this is a SPNE, we only need to consider “single deviations” in which an 
agent only deviates from Grim once and then returns to playing it. This follows from 
a generalization of the single deviation property in dynamic programming. 

Consider a deviation in which Alice plays D in round 0. Clearly this will lead 
to (D, D) in all future rounds for Alice (for everyone, in fact), so Alice’s discounted 
average payoff will be $(1 - \delta)(2 + \delta \times 0 + \delta^2 \times 0 + \cdots) = 2(1 - \delta)$; however, if she 
did not deviate, then her payoff would be 1 in every period, leading to 
$(1 - \delta)(1 + \delta + \delta^2 + \cdots) = 1$. Thus, deviating is not advantageous when $1 \ge 2(1 - \delta)$ or, equivalently, 
$\delta \ge 1/2$. Now, this same argument applies to any period $t > 0$ with both sides of the 
equations multiplied by $\delta^t$. 
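
The single-deviation comparison above is easy to reproduce numerically. The following sketch assumes the stage payoffs of the example (2 for a lone defector, 0 under mutual defection, 1 under mutual cooperation); the function names are hypothetical.

```python
def discounted_average(payoffs, delta, tail=0.0):
    """(1 - delta) * sum over t of delta**t * payoff_t.

    `payoffs` lists the first len(payoffs) rounds explicitly; every later
    round is assumed to pay the constant `tail`.
    """
    head = sum(delta ** t * p for t, p in enumerate(payoffs))
    n = len(payoffs)
    # Geometric tail: (1 - delta) * tail * delta**n / (1 - delta) = tail * delta**n
    return (1 - delta) * head + tail * delta ** n


def grim_deviation_gain(delta):
    """Payoff from defecting in round 0 minus payoff from never deviating."""
    deviate = discounted_average([2], delta, tail=0.0)  # 2 now, then (D, D) forever
    comply = discounted_average([], delta, tail=1.0)    # (C, C) in every round
    return deviate - comply
```

The gain equals 1 - 2 * delta, so it is positive for delta below 1/2, zero at exactly 1/2, and negative above, matching the threshold derived in the text.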

Thus, when $\delta$ is small, the promise of future payoffs is not sufficient to constrain the 
player’s current behavior. This is true in all reputation systems: if the players do not 
value future payoffs sufficiently, then reputations are of no value. 

Other strategies that are “less grim” can also work. For example, punishing for only 
a small number of periods can lead to a cooperative equilibrium for higher values of $\delta$. 

Now consider a group of N + 1 players with N odd, in which in each round players 
are paired up at random and play the prisoners’ dilemma. In a simple reputational 
extension of the above analysis we consider reputational-grim, defined as follows: each 
agent begins with a “good” reputation and keeps it if she plays C against players with 
good reputations and D against those with bad ones. This reputational-grim strategy, if 
played by all players, is also an SPNE, for $\delta \ge 1/2$. This is because, from a defector’s 
perspective, the punishments are the same as in the full Grim strategy. 

To understand the value of shared reputations, consider an alternative system where 
a player remembers others’ interactions with her but histories are not publicly shared. 
A natural strategy is to play personalized-Grim, the variant of Grim where a player 
views the game as being separated into N unrelated games, one with each opponent. In 
this case, the expected number of rounds between meeting the same opponent is N so 
a straightforward calculation (see exercises) yields a condition for this to be an SPNE, 
$\delta \ge 1 - 1/2N$, which is unreasonably close to 1 for large N. 

The analysis above applies to situations in which all players have the same ability, but 
reputations lead them to strategies where they are reliable partners. To operationalize 
varying player abilities, models allow different players different action sets to choose 
from in the stage game. For example, a low-ability player might only have action D 
available (or perhaps in some percentage of stage games have only action D available). 
A high-ability, honest type might only have action C available. Alternatively, it might 
take more effort (cost more) for a low type to play C than for the high type. This could 
arise where C indicates the completion of a high-quality product. (Player types with 
only one possible action are called “commitment” types in the economics literature.) 
Players with both types of action available (called “strategic” types in the economics 
literature) would then want to choose actions that distinguish them from low-ability 
players and mimic those of high-ability players. 

It is also natural to extend the model to situations in which outcomes are only prob- 
abilistically linked to actions, or outcomes are reported with random error. This leads 
to interesting strategic opportunities, including playing C most of the time but some- 
times choosing D, which would not be immediately distinguishable from the actions of 
high-ability honest types who also have bad outcomes only less frequently. The anal- 
ysis of these models is interesting and complex, but beyond the scope of this chapter. 
(However, in the following section we will consider random outcomes in a limited way.) 


27.3 Whitewashing 


One key issue in online reputation systems is the fragility of identity. Agents with bad 
reputations simply reregister with a new username. This is known as whitewashing. It 
is easy to see that the ability to whitewash will disable the functioning of the reputation 
systems as described in Section 27.2, as agents will simply choose D and then return 
with a new identity in the following round. 

To prevent this, there needs to be some “initiation fee” upon entry. For example, 
simply having an upfront cost of f to register will prevent whitewashing as long as 
the cost is sufficiently high. To compute this $f$, note that the total discounted payoff 
for deviating once is $\pi' = (1 - \delta)(2 - f + \delta(1 - f) + \delta^2 + \delta^3 + \cdots)$, while following 
reputational-grim obtains $\pi = (1 - \delta)(1 - f + \delta + \delta^2 + \cdots)$. Thus for an SPNE we 
need $\pi \ge \pi'$, which implies that $\delta f \ge 1$, or $f \ge 1/\delta$. (Note that we continue to require 
that $\delta \ge 1/2$ to prevent deviation without whitewashing.) 
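
To see where the fee threshold comes from, the two payoff streams can be evaluated in closed form. The sketch below assumes, as in the text, that a whitewasher defects once, pays the fee a second time on re-entry, and cooperates from then on; the function names are illustrative.

```python
def payoff_comply(delta, fee):
    """Discounted average payoff: pay the fee once, cooperate forever.
    Stream: 1 - fee, 1, 1, ...
    """
    return (1 - delta) * (1 - fee) + delta


def payoff_whitewash_once(delta, fee):
    """Discounted average payoff: defect in round 0, re-register in round 1.
    Stream: 2 - fee, 1 - fee, 1, 1, ...
    """
    return (1 - delta) * ((2 - fee) + delta * (1 - fee)) + delta ** 2


def comply_advantage(delta, fee):
    """Complying minus whitewashing; algebraically (1 - delta) * (delta * fee - 1)."""
    return payoff_comply(delta, fee) - payoff_whitewash_once(delta, fee)
```

For example, with a discount factor of 0.8 the advantage changes sign at a fee of 1.25, which is 1 divided by the discount factor.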

Unfortunately collecting fees is not always feasible (or politically viable); however, 
we can create an explicit reputational fee. The key idea is to force the new arrivals to 
“pay dues” upon arrival. The most efficient way to do this is to allow veterans to defect 
against newcomers, where newcomers are playing for the first time (apparently) and 
veterans have played at least once before. Thus, we can define the pay-your-dues (PYD) 
strategy as: play C against any veteran who has never deviated from PYD, otherwise 
play D against the veteran. Play D against a newcomer, unless you are a newcomer too, 
in which case play C. 
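
Written as a decision rule, PYD needs only three bits of information about a pairing. The sketch below is one possible encoding; the function and argument names are hypothetical, and it assumes that a partner's newcomer status and deviation record are observable.

```python
def pyd_action(me_is_newcomer: bool,
               partner_is_newcomer: bool,
               partner_has_deviated: bool) -> str:
    """Action prescribed by pay-your-dues (PYD) for one stage game."""
    if partner_is_newcomer:
        # Newcomers cooperate with each other; veterans make newcomers pay dues.
        return "C" if me_is_newcomer else "D"
    # Partner is a veteran: cooperate only if the partner never deviated from PYD.
    return "D" if partner_has_deviated else "C"
```

A newcomer meeting a veteran in good standing therefore plays C while the veteran plays D, which is the (D, C) outcome that constitutes the newcomer's dues.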

Intuitively, this leads to the “socially most efficient” SPNE, where social efficiency 
measures the sums of all players’ payoffs. Note, however, that the social efficiency 
in this equilibrium is less than the maximum social efficiency that could be attained 
without whitewashing. This follows because the maximum social welfare in a single 
pair playing the PD is 2 while choosing (D, C) yields a value of 2 — 1 = 1. (One might 
consider requiring that newcomers play D against other newcomers, but this obtains a 
value of 0 and entails further social loss.) Thus, the possibility of whitewashing leads 
to an unavoidable cost being imposed on society. 

Even allowing for whitewashing, PYD leads to an SPNE where every player’s 
average discounted payoff is 1. (You should verify this as in the exercises.) However, we 
have left out several important details in this model that we discuss in the next section! 

27.3.1 A More Dynamic Model 


Stepping back, we see that the model we just analyzed has a flaw, since any newcomers 
in our model are clearly whitewashers. Thus, for that model, always playing D against 
an agent who arrived after the first period (and personalized-grim otherwise) yields a 
fully socially efficient SPNE, since (C, C) is played in every interaction. 

Thus, it makes sense to extend our model to capture these issues, although the 
difficulty is retaining tractability. First, we assume $\alpha N$ real newcomers arrive every 
period and an equal number of veterans depart, where the departing veterans are chosen 
at random. However, once again this allows us to easily detect whitewashers: if there 
are more than $\alpha N$ newcomers in any period then the players know that there must be 
at least one whitewasher. Thus, there is an equilibrium in which players play PYD as 
long as there are exactly $\alpha N$ newcomers in any period and play D-always if there are 
ever more. However, it is clear that this equilibrium is extremely fragile, since a single 
deviation leads to all players defecting forever. Such fragile equilibria are artifacts of 
the “noiselessness” of the game and the perfect rationality assumptions inherent in 
game theory. 

To make our model more robust, we add some “noise.” We assume that in any play 
of the stage game a player accidentally plays D with probability $\epsilon > 0$ and then returns 
in the following period as a whitewasher. In this model, one can show that PYD leads 
to the most efficient equilibrium (i.e., the highest fraction of cooperative outcomes 
(C, C)). Proving that PYD is an equilibrium is intuitively similar to above proofs with 
the addition of some ideas from dynamic programming, while proving optimality is 
more difficult and requires a careful stochastic analysis. 

The PYD strategy in this stylized model corresponds in more practical settings to 
a mistrust of newcomers. Until they have proven themselves, veterans do not trust the 
newcomers sufficiently to allow them to undertake mutually beneficial interactions. 
If only the veterans could trust the newcomers, the newcomers could start right 
away to interact in beneficial ways with the veterans. The threat of whitewashing, 
however, forces a mistrust of newcomers. Because of the threat of whitewashing, in 
any equilibrium, newcomers must also be penalized at least the amount that a deviator 
would be penalized. 

The only way to improve the treatment of newcomers in an equilibrium with sig- 
nificant cooperation is to make whitewashing difficult, by making it more difficult or 
expensive for existing participants to get new pseudonyms than it is for newcomers. 
For example, the organization running the reputation system might require entities to 
reveal their true names, offer them one free pseudonym, and then restrict the acquisition 
of additional ones or require a payment for them. 


27.4 Eliciting Effort and Honest Feedback 


The previous section described models in which feedback was reported automatically 
and objectively. Any system that actually solicits individual opinions must overcome 
two challenges. The first is underprovision. Forming and reporting an opinion requires 
time and effort, yet the information benefits others. The second challenge is honesty. A 
desire to be nice, or fear of retaliation, may cause a rater to withhold negative feedback. 
Conflicts of interest or a desire to improve others’ perception of them may lead raters 
to report distorted versions of their true opinions. 

An explicit reward system for honest rating and effort may help overcome these 
challenges. When objective information will be publicly revealed at a future time, indi- 
viduals’ reports can be compared to that objective information. For example, weather 
forecasts and sports betting odds can be compared to what actually occurs. See Chapter 
26 on information markets for algorithms that create incentives for honest revelation 
of information in such settings. 

Here, we develop methods to elicit feedback effectively when independent, objective 
outcomes are not available. Examples include situations where no objective outcome 
exists (e.g., evaluations of a product’s “quality”), and where the relevant information 
is objective but not public (e.g., a product’s breakdown frequency, which is available 
to others only if the product’s current owners reveal it). 

In these situations, one solution is to compare raters’ reports to their peers’ reports 
and reward agreement.¹ However, if rewards are made part of the process, dangers 
arise. If a particular outcome is highly likely, such as a positive experience with a seller 
at eBay who has a stellar feedback history, then a rater who has a bad experience will 
still believe that the next rater is likely to have a good experience. If she were to be 
rewarded simply for agreeing with her peers, she would not report her bad experience. 
This phenomenon is akin to the problems of herding or information cascades. 

We now describe a formal mechanism, the peer-prediction method, to implement 
the process of comparing with peers. The scheme uses one rater’s report to update 
a probability distribution for the report of someone else, whom we refer to as the 
reference rater. The first rater is then scored not on agreement between the ratings, but 
on a comparison between the probabilities assigned to the reference rater’s possible 
ratings and the reference rater’s actual rating. Raters need not perform any complex 
computations: so long as a rater trusts that the system will update appropriately, she 
will prefer to report honestly. 

Scores can be turned into monetary incentives, either as direct payments or as 
discounts on future merchandise purchases. In many online systems, however, raters 
seem to be quite motivated by prestige or privileges within the system. For example, 
at Slashdot.org, users accumulate “karma” points for various actions and higher karma 
entitles users to rate others’ postings and to have their own postings begin with higher 
ratings; at ePinions.com, reviewers gain status and have their reviews highlighted 
if they accumulate points. Similarly, offline point systems that do not provide any 
tangible reward seem to motivate chess and bridge players to compete harder and more 
frequently. 


¹ Subjective evaluations of ratings could be elicited directly instead of relying on correlations between ratings. For 
example, the news and commentary site Slashdot.org allows meta-moderators to rate the ratings of comments 
given by regular moderators. Meta-evaluation incurs an obvious inefficiency, since the effort to rate evaluations 
could presumably be put to better use in rating comments or other products that are a site’s primary product of 
interest. Moreover, meta-evaluation merely pushes the problem of motivating effort and honest reporting up one 
level, to ratings of evaluations. Thus, scoring evaluations in comparison to other evaluations may be preferable 
in certain settings. 

27.4.1 A Model 


We now consider a model to analyze these issues. A number of raters experience a 
product and then rate its quality. The product’s quality does not vary, but is observed with 
some idiosyncratic error. After experiencing the product, each rater sends a message 
to a common processing facility called the center. The center makes transfers to each 
rater, awarding or taking away points based on the raters’ messages. The center has no 
independent information, so its scoring decisions can depend only on the information 
provided by other raters. As noted above, points may be convertible to money, discounts 
or privileges within the system, or merely to prestige. We assume that raters’ utilities 
are linear in points. We also assume that raters are risk neutral, and hence, seek to 
maximize expected wealth. 

We refer to a product’s quality as its type. Assume the number of product types 
is finite, and the types are indexed by $t = 1, \ldots, T$. Furthermore, we assume that 
there is a commonly known prior: let $\Pr_0(t)$ be the commonly held prior 
probability assigned to the product’s being type $t$. Assume that $\Pr_0(t) > 0$ for all $t$ and 
$\sum_{t=1}^{T} \Pr_0(t) = 1$. 

Let $I$ be the set of raters, where $|I| \ge 3$. $I$ may be (countably) infinite. Each rater has 
a perception of a product’s type, which is called her signal. Each rater privately observes 
her own signal; she does not know any other rater’s signal. Let $S = \{s_1, \ldots, s_M\}$ be 
the set of possible signals, and let $S^i$ denote the random signal received by rater 
$i$. Conditional on the product’s type, raters’ signals are independent and identically 
distributed; the distribution is represented by the function $f(s_m|t) = \Pr(S^i = s_m|t)$, where 
$f(s_m|t) > 0$ for all $s_m$ and $t$, and $\sum_{m=1}^{M} f(s_m|t) = 1$ for all $t$. We assume that this 
function $f(s_m|t)$ is common knowledge. Furthermore, we assume that the conditional 
distribution of signals is different for different values of $t$, so that the signals are 
informative about the types. 

Throughout this section, we use the following simple example as an illustration. 
There are only two product types, $H$ and $L$, with prior $\Pr_0(H) = \Pr_0(L) = 0.5$, and two 
possible signals, $h$ and $l$. The distribution of the signals, conditioned on the true type, 
is as follows: $f(h|H) = 0.85$, $f(l|H) = 0.15$, $f(h|L) = 0.45$, $f(l|L) = 0.55$. Thus, 
$\Pr(h) = 0.5 \times 0.85 + 0.5 \times 0.45 = 0.65$. 

In the mechanism we propose, the center asks each rater to announce her 
signal. After all signals are announced to the center, they are revealed to the other 
raters and the center computes transfers. We refer to this as the simultaneous 
reporting game. Let $x^i \in S$ denote one such announcement, and $x = (x^1, \ldots, x^I)$ 
denote a vector of announcements, one by each rater. Let $x^i_m \in S$ denote rater $i$’s 
announcement when her signal is $s_m$, and $\bar{x}^i = (x^i_1, \ldots, x^i_M) \in S^M$ denote rater 
$i$’s announcement strategy. Let $\bar{x} = (\bar{x}^1, \ldots, \bar{x}^I)$ denote a vector of announcement 
strategies. As is customary, let the superscript “$-i$” denote a vector without rater $i$’s 
component. 

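Let $\tau_i(x)$ denote the transfer paid to rater $i$ when the raters make announcements 
$x$, and let $\tau(x) = (\tau_1(x), \ldots, \tau_I(x))$ be the vector of transfers made to all agents. An 
announcement strategy $\bar{x}^i$ is a best response to $\bar{x}^{-i}$ for player $i$ if for each $m$: 

$$E_{S^{-i}}\left[\tau_i(x^i_m, \bar{x}^{-i}) \mid S^i = s_m\right] \ge E_{S^{-i}}\left[\tau_i(\hat{x}^i, \bar{x}^{-i}) \mid S^i = s_m\right] \quad \text{for all } \hat{x}^i \in S. \qquad (27.1)$$
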
That is, a strategy is a best response if, conditional on receiving signal $s_m$, the 
announcement specified by the strategy maximizes that rater’s expected transfer, where 
the expectation is taken with respect to the distribution of all other raters’ signals 
conditional on $S^i = s_m$. Given transfer scheme $\tau(x)$, a vector of announcement strategies 
$\bar{x}$ is a Nash Equilibrium of the reporting game if (27.1) holds for $i = 1, \ldots, I$, and a 
strict Nash Equilibrium if the inequality in (27.1) is strict for all $i = 1, \ldots, I$. 

Truthful revelation is a Nash Equilibrium of the reporting game if (27.1) holds for 
all $i$ when $x^i_m = s_m$ for all $i$ and all $m$. Furthermore, truthful revelation is a strict Nash 
Equilibrium if the inequality is strict. (In other words, if all the other players announce 
truthfully, truthful announcement is a strict best response.) 

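Continuing the two-type, two-signal example, suppose that rater $i$ receives the signal 
$l$. Recall that $\Pr_0(H) = 0.5$, $f(h|H) = 0.85$, and $f(h|L) = 0.45$, so that $\Pr(S^i = l) = 0.35$. 
Given $i$’s signal, the probability that rater $j$ will receive a signal $h$ is 

$$\Pr(S^j = h \mid S^i = l) = f(h|H) \frac{f(l|H) \Pr_0(H)}{\Pr(S^i = l)} + f(h|L) \frac{f(l|L) \Pr_0(L)}{\Pr(S^i = l)} = 0.85 \cdot \frac{0.15 \times 0.5}{0.35} + 0.45 \cdot \frac{0.55 \times 0.5}{0.35} \approx 0.54.$$

If $i$ had instead observed $h$, then: 

$$\Pr(S^j = h \mid S^i = h) = f(h|H) \frac{f(h|H) \Pr_0(H)}{\Pr(S^i = h)} + f(h|L) \frac{f(h|L) \Pr_0(L)}{\Pr(S^i = h)} = 0.85 \cdot \frac{0.85 \times 0.5}{0.65} + 0.45 \cdot \frac{0.45 \times 0.5}{0.65} \approx 0.71.$$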
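
The two updates above are a Bayes step over the product type followed by a prediction of the other rater's signal. A minimal sketch using the numbers of the running example follows; the dictionary layout and function names are assumptions made for illustration.

```python
PRIOR = {"H": 0.5, "L": 0.5}                  # prior over product types
F = {"H": {"h": 0.85, "l": 0.15},             # F[type][signal] = f(signal | type)
     "L": {"h": 0.45, "l": 0.55}}


def signal_prob(s):
    """Unconditional probability of observing signal s."""
    return sum(PRIOR[t] * F[t][s] for t in PRIOR)


def type_posterior(t, s):
    """Probability that the product is of type t after observing signal s."""
    return PRIOR[t] * F[t][s] / signal_prob(s)


def peer_signal_prob(s_j, s_i):
    """Probability that rater j observes s_j, given that rater i observed s_i."""
    return sum(F[t][s_j] * type_posterior(t, s_i) for t in PRIOR)
```

Rounded to two decimals, `peer_signal_prob("h", "l")` and `peer_signal_prob("h", "h")` give the 0.54 and 0.71 used in the text.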


27.4.2 Peer-Prediction Scoring 


We now describe how to assign points to a rater $i$, based on her report and that of another 
player $j$. A scoring rule is a function $T(s|x^i)$ that, for each possible announcement $x^i$ 
of $S^i$, assigns a score to each possible value $s \in S$. We cannot directly access the signal 
$s_j$, but in a truthful equilibrium, we can use player $j$’s report. 


Definition 27.1 A scoring rule is strictly proper if the rater maximizes her 
expected score by announcing her true beliefs. 


The literature contains a number of strictly proper scoring rules for eliciting beliefs 
about the probability of an event. The score can be positive or negative. For example, one 
proper scoring rule, the logarithmic scoring rule, gives the player the log of the 
probability she assigned to the event that actually occurred; since this is never positive, it acts as a penalty. Suppose that there are only 
two possible events $(h, l)$, and a player is asked to announce a probability $\hat{p}$ 
for event $h$. The log scoring rule is defined by $T(h|\hat{p}) = \ln(\hat{p})$, $T(l|\hat{p}) = \ln(1 - \hat{p})$. 
If her true belief is that $h$ occurs with probability $p$, then the expected value of 
announcement $\hat{p}$ is $p \ln \hat{p} + (1 - p) \ln(1 - \hat{p})$. Setting the first derivative with respect to $\hat{p}$ to 0 gives 
the first-order condition for maximization, which requires $\hat{p} = p$. 
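
The first-order condition can also be checked numerically: for a fixed true belief, the expected log score over a grid of possible announcements peaks at the truthful one. The sketch below is only a grid search, not a proof; the function names and grid resolution are arbitrary choices.

```python
import math


def expected_log_score(p_true, p_report):
    """Expected log score when h truly occurs with probability p_true
    and the player announces probability p_report for h."""
    return p_true * math.log(p_report) + (1 - p_true) * math.log(1 - p_report)


def best_report(p_true, grid_size=999):
    """Announcement on the grid 0.001, 0.002, ..., 0.999 with the highest
    expected log score."""
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    return max(grid, key=lambda q: expected_log_score(p_true, q))
```

For instance, `best_report(0.3)` lands on the grid point 0.3, and any other announcement, such as 0.5, yields a strictly lower expected score.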

In the peer-prediction method, for each player $i$ we choose a reference rater $r(i)$. The 
outcome to be predicted is the reference rater’s announcement $x^{r(i)}$. Player $i$ does not 
directly report a probability distribution over the reference rater’s report: it is inferred 
from her own report and the prior probability distribution. Truthful reporting is still a 
best response if she believes that the reference rater will report honestly. 

We write $T(x^{r(i)}|x^i)$ for $\ln[\Pr(S^{r(i)} = x^{r(i)} \mid S^i = x^i)]$, i.e., the log of the inferred 
probability that $r(i)$ will see $x^{r(i)}$ given that $i$ sees signal $x^i$. Then, let 

$$\tau_i^*(x^i, x^{r(i)}) = T(x^{r(i)}|x^i). \qquad (27.2)$$


Proposition 1 For any mapping $r$ that assigns to each rater $i$ a reference rater 
$r(i) \ne i$, truthful reporting is a strict Nash equilibrium of the simultaneous reporting 
game with transfers $\tau_i^*$. 


PROOF Assume that rater $r(i)$ reports honestly: $x^{r(i)}(s_m) = s_m$ for all $m$. Since 
$S^i$ is stochastically informative for $S^{r(i)}$, and $r(i)$ reports honestly, $S^i$ is 
stochastically informative for $r(i)$’s report as well. For any $S^i = s^*$, player $i$ chooses 
$x^i \in S$ to maximize 

$$\sum_{n=1}^{M} T(s_n|x^i) \Pr(S^{r(i)} = s_n \mid S^i = s^*). \qquad (27.3)$$

Since $T(\cdot|\cdot)$ is a strictly proper scoring rule, (27.3) is uniquely maximized by 
announcing $x^i = s^*$. Thus, given that rater $r(i)$ is truthful, rater $i$’s best response 
is to be truthful as well. 


Since $0 < \Pr(S^{r(i)} = s_n \mid S^i = s^*) < 1$, we have $\ln \Pr(S^{r(i)} = s_n \mid S^i = s^*) < 0$; we refer to $\tau_i^*$ as 
rater $i$’s penalty since it is always negative in this case. (By adding a suitably large 
constant that depends only on the distribution $f$, it is in principle possible to convert 
this to a positive score without altering its strategic properties.) 

Consider the simple example where rater $i$ received the relatively unlikely signal 
$l$ ($\Pr(S^i = l) = 0.35$). Even contingent on observing $l$ it is unlikely that rater $j$ will 
also receive an $l$ signal ($\Pr(S^j = l \mid S^i = l) = 1 - 0.54 = 0.46$). Thus, if rater $i$ were 
rewarded merely for matching her report to that of rater $j$, she would prefer to report 
$h$. With the log scoring rule, an honest report of $l$ leads to an expected payoff 

$$\ln[\Pr(S^j = h \mid S^i = l)] \Pr(S^j = h \mid S^i = l) + \ln[\Pr(S^j = l \mid S^i = l)] \Pr(S^j = l \mid S^i = l) = \ln(0.54) \, 0.54 + \ln(0.46) \, 0.46 = -0.69.$$

If, instead, she reports $h$, rater $i$’s expected score is 

$$\ln[\Pr(S^j = h \mid S^i = h)] \Pr(S^j = h \mid S^i = l) + \ln[\Pr(S^j = l \mid S^i = h)] \Pr(S^j = l \mid S^i = l) = \ln(0.71) \, 0.54 + \ln(0.29) \, 0.46 = -0.75.$$

As claimed, the expected score is maximized by honest reporting. 
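
The comparison above can be reproduced for both signals at once. The sketch below implements the log-score transfer for the two-type, two-signal example under the assumption that the reference rater reports truthfully; all names are illustrative, and the posteriors are kept at full precision rather than rounded to two decimals.

```python
import math

PRIOR = {"H": 0.5, "L": 0.5}
F = {"H": {"h": 0.85, "l": 0.15}, "L": {"h": 0.45, "l": 0.55}}
SIGNALS = ("h", "l")


def peer_signal_prob(s_ref, s_own):
    """Probability that the reference rater sees s_ref, given own signal s_own."""
    marginal = sum(PRIOR[t] * F[t][s_own] for t in PRIOR)
    return sum(F[t][s_ref] * PRIOR[t] * F[t][s_own] / marginal for t in PRIOR)


def score(report, ref_report):
    """Log of the probability, inferred from `report`, of the reference report."""
    return math.log(peer_signal_prob(ref_report, report))


def expected_score(true_signal, report):
    """Expected score of announcing `report` after observing `true_signal`,
    when the reference rater is truthful."""
    return sum(peer_signal_prob(s, true_signal) * score(report, s) for s in SIGNALS)
```

With unrounded posteriors, an honest report of l scores about -0.69 in expectation, while a false report of h scores about -0.76 (the text's -0.75 comes from rounding the posteriors first). Honesty is also strictly better after observing h.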

The key idea is that the scoring function is based on the updated beliefs about the 
reference rater’s signal, given the rater’s report, not simply matching a rater’s report to 
the reference report. The updating takes into account both the priors and the reported 
signal, and thus reflects the initial rater’s priors. Thus, she has no reason to shade her 
report toward the signal expected from the priors. Note also that she need not perform 
any complex Bayesian updating. She merely reports her signal. As long as she trusts the 
center to correctly perform the updating and believes other raters will report honestly, 
she can be confident that honest reporting is her best action. 

Note that while Proposition 1 establishes that there is a truthful equilibrium, it is 
not unique, and there may be nontruthful equilibria. To illustrate, in the example we 
have been considering two other equilibria are (1) report $h$ all the time, and (2) report 
$l$ all the time.² While such nontruthful equilibria exist, it is reasonable to think that 
the truthful equilibrium will be a focal point, especially when communication among 
raters is limited, or when some raters are known to have a strong ethical preference 
for honesty. In addition, the center can punish all the raters if it detects a completely 
uninformative equilibrium such as all $h$ or all $l$. 
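
The two uninformative equilibria can be verified directly: if the reference rater's report is a constant, a rater's score depends only on her own announcement, so her best reply ignores her signal. The sketch below checks this for the running example; the names are hypothetical and the posteriors are computed at full precision.

```python
import math

PRIOR = {"H": 0.5, "L": 0.5}
F = {"H": {"h": 0.85, "l": 0.15}, "L": {"h": 0.45, "l": 0.55}}


def peer_signal_prob(s_ref, s_own):
    """Probability that the reference rater sees s_ref, given own signal s_own."""
    marginal = sum(PRIOR[t] * F[t][s_own] for t in PRIOR)
    return sum(F[t][s_ref] * PRIOR[t] * F[t][s_own] / marginal for t in PRIOR)


def best_reply_to_constant(ref_always):
    """Announcement with the highest log score when the reference rater
    always reports `ref_always`, whatever she actually observed."""
    return max(("h", "l"),
               key=lambda x: math.log(peer_signal_prob(ref_always, x)))
```

Against a reference rater who always says h the best reply is h, and against one who always says l it is l, so each constant profile is self-enforcing.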

A variety of extensions to this base scoring rule have been studied. For example, 
adding a constant value to the score increases the expected payoff without changing 
the incentives for honest revelation. Multiplying the score by a constant preserves the 
incentive for honest revelation but changes the amount of costly effort a rater will want 
to exert in order to acquire an informative signal. The points that each person earns can 
be debited from some other participant, so that all scores are settled through transfer 
payments rather than subsidies from the center. Alternative proper scoring rules to 
reduce the expected size of payments have also been studied. 

The payments can be adapted to a sequential interaction scenario where each rater 
sees the previous rater’s reports before reporting herself. Each rater is scored on the 
basis of the probability distribution inferred from the common prior beliefs, her own 
report, and previous reports. Since the center will take into account others’ reports 
automatically, it is optimal to report just her own signal. 

The most problematic aspect of the scoring mechanism is its reliance on common 
prior beliefs about the distribution of types and the distribution of signals contingent 
on types. These are needed to infer from a user’s reported signal $x_i$ the probability 
distribution R for the reference rater’s signal, which is used to determine the user’s 
point score. A seemingly attractive alternative is to elicit R directly, but player i may 
also be a reference rater for some other player, and so $x_i$ must be truthfully elicited to 
score that other player. 

The requirement of common priors can be relaxed somewhat if each player is asked 
to report her personal priors about the item’s type before receiving her information 
signal about the item, and then to report her signal once she receives it. There still is a 
requirement of common beliefs about the distribution of signals contingent on types, 
in order to perform Bayesian updating correctly. One solution would be to define the 
types empirically according to the distribution of signals they elicit (e.g., type 1 yields 
10% h signals; type 2 yields 20%, etc.). Then, the beliefs about distribution of signals 
contingent on type would, by construction, be commonly held. 

Many open questions remain about the peer-prediction method. Can it be extended 
to situations in which raters vary in their abilities and scores are used both to assess the 
credibility of raters and to give them incentives for effort and honest reporting? Can the 
method be extended to situations in which entities choose their interaction partners 
rather than being randomly matched? Can it be made robust to collusion among entities 
or sybil attacks with fake entities providing confirmatory ratings? 

² To verify the “always play h” equilibrium, note that if the reference rater always reports h, the rater expects 
ln(0.54)·1 + ln(0.46)·0 = −0.61619 if she reports l, and ln(0.71)·1 + ln(0.29)·0 = −0.34249 if she reports h. 
Similar reasoning verifies the “always play l” equilibrium. 


27.5 Reputations Based on Transitive Trust 


In this section, we discuss the transitive trust approach to dealing with the lack of 
objective feedback. The foundation of this approach is the postulate that the credibility 
of an agent’s feedback is tied to the credibility of her non-feedback actions. This 
assumption enables the construction of reputation systems in the absence of any external 
signals of interaction outcomes or feedback quality: an entity’s reputation is calculated 
by weighting ratings of the entity according to the raters’ credibilities, which are in 
turn calculated from those raters’ reputations. Thus, if we begin with some set of 
credible agents, we can potentially grow this set transitively: If the currently credible 
agents have positive feedback about i, i can be included in the set of credible agents. 
This is a recursive construction; we need to carefully define how to bootstrap the 
credibility calculation, how to propagate the credibility through the network, and when 
to terminate the calculation. 

One additional simplification is often employed in reputation algorithms, which is 
to ignore the temporal order in which feedback is received. Now, the feedback can be 
succinctly expressed in graphical form: At a given point of time, let $t(i, j)$ denote the 
summary feedback (trust) that i reports about j, based on interactions between them 
thus far. We assume that the trust can be expressed as a nonnegative real value. Then, the 
input to the reputation system can be viewed as a “trust graph” $G = (V, E, t)$, where 
V is the set of agents, E the set of directed edges, and $t : E \to \mathbb{R}^+ \setminus \{0\}$ the weights. 
(Note that typically the graph will be quite sparse, so for algorithmic considerations 
we explicitly include E.) 

We assume that the reputations computed by our system are numeric values. Then, 
the reputation aggregation mechanism can be represented as a function from a trust 
graph to a set of reputation values, $F : G \to \mathbb{R}^{|V|}$, where $F_v(G)$ is the reputation value 
of vertex $v \in V$. The reputation values determine an ordering or ranking of the nodes. 
A reputation function is trivial if the ranking induced by F(G) is constant over all G; 
we restrict our attention to nontrivial reputation functions. 

This model captures the many reputation systems that have been proposed or used 
in practice. One important example is PageRank, a mechanism used by Google to rank 
Web pages. In this case $v \in V$ is a Web page, $(v, w) \in E$ is a directed edge showing 
that Web page v has a hyperlink to page w and $t(v, w) = 1/\mathrm{Out}(v)$, where Out(v) is 
the outdegree of v. In a peer-to-peer system, $v \in V$ is a peer, $(v, w) \in E$ is a directed 
edge showing that peer v has interacted with w and t(v, w) represents the degree of 
trust that v has in w, which can depend on the number, type, and outcomes of v’s 
interactions with w. 

There are numerous ways in which the reputations can be computed from the trust 
graph. We consider a simple version of PageRank, in which the ranking function is 
given by 


$$F_v(G) = \epsilon + (1 - \epsilon) \sum_{v' \mid (v', v) \in E} F_{v'}(G)\, t(v', v).$$
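
As a rough illustration, the simple PageRank recursion, in which each node's reputation is ε plus (1 − ε) times the trust-weighted sum of the reputations of the nodes pointing to it, can be solved by repeated substitution. The sketch below assumes the Web weights t(v', v) = 1/Out(v'), a value ε = 0.15 chosen only for illustration, and a fixed number of iterations; the function name and graph encoding are hypothetical.

```python
def simple_pagerank(nodes, edges, eps=0.15, iterations=200):
    """Iterate F_v = eps + (1 - eps) * sum over edges (u, v) of F_u / Out(u).

    nodes: iterable of node labels; edges: list of (source, target) pairs.
    """
    out_degree = {v: 0 for v in nodes}
    for src, _ in edges:
        out_degree[src] += 1

    scores = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        updated = {v: eps for v in nodes}
        for src, dst in edges:
            # Each node splits its reputation evenly over its outgoing links.
            updated[dst] += (1.0 - eps) * scores[src] / out_degree[src]
        scores = updated
    return scores

# On a directed 3-cycle every node solves F = eps + (1 - eps) * F, i.e. F = 1.
cycle = simple_pagerank(["a", "b", "c"], [("a", "b"), ("b", "c"), ("c", "a")])
print(cycle)
```

Because each update shrinks differences by a factor of at most 1 − ε, the iteration converges from any starting point; a production implementation would more likely test for convergence than run a fixed number of rounds.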

Another interesting aggregation function, used in the Advogato system, is the max-flow 
algorithm, where $F_v(G)$ is the maximum flow from some start node $v_0 \in V$ to v. In 
the P2P setting it is natural to create personalized reputation functions where each 
node uses itself as the start node. In the web ranking setting one can simply choose 
one (or several) “trusted” nodes as the start nodes. Lastly, for comparison, we consider 
the Pathrank algorithm where $F_v(G)$ is the shortest path from some start node $v_0 \in V$ 
to v, where the length of an edge is simply the inverse of the trust value. 
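
The shortest-path computation behind Pathrank can be sketched with Dijkstra's algorithm, taking the length of each edge to be the inverse of its trust value as described above. This is only an illustrative sketch: the function name, the dictionary encoding of the trust graph, and the example values are assumptions, and how the resulting distances are converted into a ranking is left as in the text.

```python
import heapq

def pathrank_distances(trust, start):
    """Shortest-path distance from start to every reachable node.

    trust: dict mapping (source, target) -> positive trust value.
    An edge with trust t has length 1 / t, so strong trust means a short edge.
    """
    adjacency = {}
    for (src, dst), value in trust.items():
        adjacency.setdefault(src, []).append((dst, 1.0 / value))

    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for nxt, length in adjacency.get(node, []):
            candidate = d + length
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return dist

# Hypothetical trust values: the route s -> a -> b is shorter than the direct edge s -> b.
example = pathrank_distances({("s", "a"): 2.0, ("a", "b"): 1.0, ("s", "b"): 0.5}, "s")
print(example)
```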

A reputation system is monotonic if adding an incoming edge to v never reduces 
the rank of v relative to any other node w, i.e., for $E' = E \cup \{(u, v)\}$, 
$F_v(V, E) > F_w(V, E) \Rightarrow F_v(V, E') > F_w(V, E')$ and $F_v(V, E) = F_w(V, E) \Rightarrow F_v(V, E') \geq F_w(V, E')$. 
All the reputation schemes described above are monotonic. A reputation system is 
symmetric if the function F commutes with permutation of the node names, i.e., the 
reputations depend only on the graph structure, and not on the labels of the nodes. 
The simple variant of PageRank described above is symmetric, but the other reputation 
functions are not: the start node $v_0$ enjoys a privileged position. 


27.5.1 Incentives for Honest Reporting 


With the transitive trust model, the incentive problems are particularly acute. Entities 
are not rewarded or penalized directly for the quality of the ratings they provide, only 
for the ratings they receive from others. Thus, an entity has no incentive to provide 
informative feedback. Furthermore, depending on the reputation function F, she may 
have a strong incentive to provide incorrect feedback, so as to influence the credibility 
of other agents’ feedback about herself. 

Therefore, we would like a reputation function F in which an agent v cannot 
strategically choose feedback to boost her own standing. Define a reputation system as 
rank-strategyproof if, for every graph G and every agent $v \in V$, agent v cannot boost 
her rank ordering by strategic choices of how she rates other agents. This formulation 
allows an agent to manipulate its own or others’ reputation scores as long as it is unable 
to improve its position in the rank ordering of reputation scores. 

It turns out that rank-strategyproofness is very difficult to achieve in symmetric 
reputation systems: Any nontrivial, monotonic reputation system that is symmetric 
cannot be rank-strategyproof. For example, in the PageRank ranking system, a node v 
may be able to improve her rank by dropping an outgoing edge (v, u) to a higher-ranked 
node u, thereby reducing u’s reputation. We refer readers to the references at the end 
of this chapter for additional results in this vein. We note that this impossibility result 
does not apply to nonsymmetric reputation systems; the Pathrank function satisfies 
both the rank-strategyproofness and monotonicity properties. 
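
The PageRank manipulation mentioned above can be reproduced on a small hypothetical graph. The sketch below assumes the simple PageRank variant with weights 1/Out(v) and ε = 0.15; the five-node graph and all names are invented for illustration. Node v initially ranks below u, which it links to, and overtakes u by withholding that link.

```python
def simple_pagerank(nodes, edges, eps=0.15, iterations=100):
    """Iterate F_v = eps + (1 - eps) * sum over edges (u, v) of F_u / Out(u)."""
    out_degree = {v: 0 for v in nodes}
    for src, _ in edges:
        out_degree[src] += 1
    scores = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        updated = {v: eps for v in nodes}
        for src, dst in edges:
            updated[dst] += (1.0 - eps) * scores[src] / out_degree[src]
        scores = updated
    return scores

nodes = ["a", "b", "c", "u", "v"]
honest = [("a", "v"), ("c", "v"), ("b", "u"), ("v", "u")]
# v strategically drops its outgoing edge to the higher-ranked node u.
strategic = [edge for edge in honest if edge != ("v", "u")]

before = simple_pagerank(nodes, honest)
after = simple_pagerank(nodes, strategic)
print("before:", before["v"], before["u"])  # v = 0.405, u = 0.62175
print("after: ", after["v"], after["u"])    # v = 0.405, u = 0.2775
```

Note that v's own score is unchanged; only its position in the ordering improves, which is exactly the kind of manipulation that rank-strategyproofness rules out.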


27.5.2 Sybils and Sybilproofness 


Next, we consider robustness to another attack on reputation systems: sybil attacks. In 
a sybil attack, a single agent creates many fake online identities to boost the reputation 
of its primary online identity. Formally, we assume that a node can create any number 
of sybil nodes, with any set of trust values between them. In addition, we allow the 
node to divide incoming trust edges among the sybils in any way that preserves the 
total trust, $\sum_{v' \mid (v', v) \in E} t(v', v)$, and manipulate the outgoing trust links in any manner 
it chooses. Note that many other formulations are possible depending on the specific 
system being modeled. Most of the results we discuss below hold in many of the other 
possible formulations. 


Definition 27.2. Given a graph $G = (V, E, t)$ and a user $v \in V$, we say that a 
graph $G' = (V', E', t')$ along with a subset $U' \subseteq V'$ is a sybil strategy for user v 
in the network $G = (V, E, t)$ if $v \in U'$ and collapsing $U'$ into a single node with 
label v in $G'$ yields G. We can refer to $U'$ as the sybils of v, and denote a sybil 
strategy by $(G', U')$. 


We define two different notions of sybilproofness for reputation functions. 


Definition 27.3 A reputation function F is value-sybilproof if for all graphs 
$G = (V, E)$, and all users $v \in V$, there is no sybil strategy for v, $(G', U')$, with 
$G' = (V', E')$ such that for some $u \in U'$, $F_u(G') > F_v(G)$. 


Definition 27.4 A reputation function F is rank-sybilproof if for all graphs 
$G = (V, E)$, and all users $v \in V$, there is no sybil strategy $(G', U')$ for v (with 
$G' = (V', E')$) such that, for some $u \in U'$ and $w \in V \setminus \{v\}$, $F_u(G') \geq F_w(G')$ 
while $F_v(G) < F_w(G)$. 


Theorem 27.5 There is no (nontrivial) symmetric rank-sybilproof reputation 
function. 


PROOF Given a graph $G = (V, E)$ and reputation function F, let $v, w \in V$ with 
$F_w(G) > F_v(G)$. Now consider the graph $G'$, which is simply 2 disjoint copies 
of G, where U is the second copy of G combined with v. By symmetry, there is 
a node $u \in U$ such that $F_u(G') = F_w(G')$. Thus F is not rank-sybilproof. 
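
The duplication argument can be illustrated with any symmetric reputation function. The sketch below uses total incoming trust (weighted in-degree) purely as a convenient stand-in; it is not one of the functions discussed in the text, and the graph, names, and prime-suffix labeling are assumptions. Node v ranks below w in G, but after v adds a disjoint copy of the whole graph as its sybils, the copy of w matches w's reputation.

```python
def weighted_in_degree(nodes, trust):
    """A simple symmetric reputation function: total incoming trust of each node."""
    score = {v: 0.0 for v in nodes}
    for (_, dst), value in trust.items():
        score[dst] += value
    return score

def add_disjoint_copy(nodes, trust, suffix="'"):
    """Return the graph made of the original plus a relabeled, disjoint copy."""
    all_nodes = list(nodes) + [v + suffix for v in nodes]
    all_trust = dict(trust)
    for (src, dst), value in trust.items():
        all_trust[(src + suffix, dst + suffix)] = value
    return all_nodes, all_trust

nodes = ["v", "w", "x"]
trust = {("x", "w"): 1.0, ("v", "w"): 1.0, ("x", "v"): 0.5}

original = weighted_in_degree(nodes, trust)
dup_nodes, dup_trust = add_disjoint_copy(nodes, trust)
duplicated = weighted_in_degree(dup_nodes, dup_trust)

# v's sybil set is {v, v', w', x'}; the sybil w' ties with w in the enlarged graph.
print(original["v"], original["w"], duplicated["w'"], duplicated["w"])
```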


Note that this result does not require the assumption that F is monotonic. In fact, 
symmetric reputation functions cannot be sybilproof even for an attack with a single 
sybil. 


Definition 27.6 We say that a reputation function is K-rank-sybilproof if it is 
rank-sybilproof for all possible sybil strategies $(G', U')$ with $|U'| \leq K + 1$. 


Theorem 27.7 There is no symmetric K-rank-sybilproof nontrivial reputation 
function for K > 0. 


PROOF Consider the graphs in the previous example, where $V = \{v = v_1, v_2, \ldots, v_n = w\}$ 
is the original vertex set and $U = \{u_1, u_2, \ldots, u_n\}$ is 
the duplicate; let $V' = V \cup U$. Define $G^t$ to be the subgraph of $G'$ with 
$V^t = V \cup \{u_1, \ldots, u_t\}$ and $G^0 = G$. Then $F_w(G^0) > F_v(G^0)$, while $F_w(G^n) = F_{u_n}(G^n)$ 
(where $u_n$ is the copy of node $v_n = w$), so there must exist a 
t such that $\max_{m \in \{v, u_1, \ldots, u_t\}} F_m(G^t) < F_w(G^t)$, but $\max_{m \in \{v, u_1, \ldots, u_{t+1}\}} F_m(G^{t+1}) \geq F_w(G^{t+1})$. 
Let m be the node in $\{v, u_1, \ldots, u_t\}$ that achieves the greatest reputation 
in $G^{t+1}$. Then either $F_m(G^{t+1}) \geq F_w(G^{t+1})$ or $F_{u_{t+1}}(G^{t+1}) \geq F_w(G^{t+1})$. 
It follows that the addition of node $u_{t+1}$ is a successful sybil strategy for m in $G^t$. 
Hence, F is not 1-rank-sybilproof on all graphs. 


Now, consider PageRank. It is clearly symmetric—changing the labels on the nodes 
does not change the reputation values. This immediately implies that it is not rank- 
sybilproof. 

One natural approach to overcoming this result is to break the symmetry of the 
reputation system by using a specific trusted node (or nodes) as a seed. However, care 
is still needed to achieve robustness against sybil attacks. Here, we consider two simple 
reputation functions that are provably sybil-resistant. 

We first consider the max-flow based ranking mechanism. It is easy to show that it 
is value-sybilproof. 


Theorem 27.8 The max-flow based ranking mechanism is value-sybilproof. 


PROOF This follows directly from max-flow equals min-cut after noticing that 
all sybils of $v \in V$ must be on the same side of the cut as v and thus on the other 
side of the cut from the source s. Thus, no sybil can have a value higher than the 
min-cut which is equal to $F_v(G)$. 
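
The min-cut argument can be checked on a small example. The sketch below implements max-flow with the Edmonds-Karp method (breadth-first augmenting paths), treating trust values as capacities; the graph, the sybil v2, and the way v redirects one incoming edge to v2 are all hypothetical choices made for illustration.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow. capacity: dict (u, v) -> nonnegative capacity."""
    if source == sink:
        return float("inf")
    residual, neighbors = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0.0) + c
        residual.setdefault((v, u), 0.0)  # reverse edge used for cancelling flow
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    flow = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbors.get(u, ()):
                if v not in parent and residual[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck

graph = {("s", "a"): 1.0, ("s", "b"): 2.0, ("a", "v"): 1.0, ("b", "v"): 1.0}
# Sybil strategy: v moves the edge from b onto sybil v2 and adds heavy internal links.
sybil_graph = {("s", "a"): 1.0, ("s", "b"): 2.0, ("a", "v"): 1.0, ("b", "v2"): 1.0,
               ("v", "v2"): 100.0, ("v2", "v"): 100.0}

honest_value = max_flow(graph, "s", "v")
sybil_values = {u: max_flow(sybil_graph, "s", u) for u in ("v", "v2")}
print(honest_value, sybil_values)
```

However large the trust values among the sybils, the total trust entering the sybil set from the rest of the graph is unchanged, so neither v nor v2 exceeds the original value.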


However, the max-flow based ranking mechanism is not rank-sybilproof, as the example 
in Figure 27.2 shows. This is because while $v \in V$ cannot increase its own value, 
it can reduce the value of nodes for which it is on a max-flow path. Nonetheless, there 
do exist nontrivial rank-sybilproof algorithms. The Pathrank reputation mechanism is 
one example: 


Theorem 27.9 The Pathrank ranking mechanism is value-sybilproof and rank- 
sybilproof. 


PROOF It is value-sybilproof since sybils cannot decrease the length of the 
shortest path. Rank-sybilproofness follows from the fact that the only time a node 
v can affect the value of another node w is if v is on the shortest path from s to 
w; however, in that case, we must have $F_v(G) > F_w(G)$. 


The basic property that flow-based mechanisms are value-sybilproof but not 
rank-sybilproof can be generalized to include a wide variety of generalized flow mechanisms, 
such as those with “leaky pipes.” Similarly, it can be shown that generalized 
path-based methods are value- and rank-sybilproof and only path-based methods are 
rank-sybilproof in a large class of reputation mechanisms. 


Figure 27.2. Node (a) improves its ranking by adding a sybil (a’) under max-flow. 

Lastly, we note that there are many open questions in this area. For example, while 
both PageRank and max-flow mechanisms are not rank-sybilproof in the worst case, 
they are very useful reputation systems, and might be less manipulable on average. A 
precise formulation and analysis of this question is still open. For example, about half 
the pages on the Web could double their PageRank using only a single sybil. 


27.6 Conclusion and Extensions 


Reputations provide one of the most successful incentive mechanisms, and reputation 
systems are widespread on the Internet today. However, many reputation systems find 
themselves constantly under attack, and have to resort to fixing strategic problems after 
they are detected. In particular, many reputation systems are engaged in a constant 
arms race against attackers, where the systems change their ranking procedure and the 
attackers experiment until they find a weakness. 

We believe that theoretical results on what can and cannot be accomplished by 
reputation systems, as well as provably secure system designs, would be very useful. In 
this chapter, we have described three components of this theory; several other directions 
have been explored, and much research remains to be done. 


27.6.1 Extensions and Open Problems 


Distributed reputation systems. Up to this point, we have considered that users may 
strategically manipulate the feedback they provide or the identities they use, but we 
have implicitly assumed that they cannot directly manipulate the way in which the 
feedback is aggregated or the content of other users’ feedback. This is a reasonable 
assumption as long as the users do not have any control over the communication 
medium or the server(s) used to compute the reputations. However, many proposed 
applications of reputation systems are settings, such as peer-to-peer applications or 
wireless ad hoc networks, in which these assumptions might be violated: there is no 
neutral trusted party to compute reputations, and users might be able to intercept each 
other’s messages. 

This has led many researchers to study distributed reputation systems in which 
the reputations are computed by the users themselves, but measures are adopted to 
minimize the risk of manipulation. One fundamental technique is to use replication: 
The same computation is performed at multiple nodes, and there are protocols to 
detect inconsistencies in the results. Similarly, if the users control portions of the 
communication network, it may be possible to send messages along multiple redundant 
paths so that no user can block or modify communication between two other users. 

Much work remains to be done in this area. In particular, the redundancy technique 
is vulnerable to collusive attacks; the main design approach is to make these attacks 
difficult by requiring that a large number of users collude. This may be compromised 
by the existence of pseudonyms and sybil attacks. 


Dynamic attacks. The basic model we have studied assumes that a user has full 
knowledge of which online identity she is interacting with. In some applications, it 
may be possible for users to claim credit for an interaction that another user executed, 
or to freeride by copying another user’s actions. For example, if the contribution being 
measured is the number of puzzles a user solves, or the quality of ratings she gives to 
online articles, she may be able to garner a high reputation simply by copying another 
user. 

On the other hand, dynamics may restrict the range of attacks in some settings. 
For example, in a P2P system a peer cannot divide incoming links among its sybils 
arbitrarily, since one needs an interaction to obtain a link and a low ranked sybil might 
have difficulty finding (nonsybil) partners. 


Metrics and benchmarks. Strategic analysis of reputation systems often takes the 
form of proving robustness against attacks. While robustness against attacks is certainly 
desirable, we should not lose sight of the performance of the reputation system. In the 
extreme, a system in which everybody has zero reputation would be perfectly secure 
but completely useless. We need to develop metrics (or empirical benchmarks) of how 
well a particular aggregation method serves the users’ information needs. One approach 
which has been taken is to formulate the performance in terms of an economic welfare 
measure, but a more direct formulation may be valuable. 


Drawing on other social sciences. We have concentrated on economic and game 
theoretic approaches to reputation. Reputation has also been studied in sociology and 
social psychology, especially in the form of the broader, but clearly related, notion of 
trust. Insights from this literature are valuable in the design of reputation systems. 


Putting it all together. The major challenge in reputation systems is to design a system 
that coherently puts together all the ideas that have been explored, including accurate 
feedback elicitation, robustness to whitewashing and sybil attacks, and distributed 
computation. This remains the key challenge for the reader! 


27.7 Bibliographic notes 


Below we provide pointers to relevant literature. Our list is meant to provide access 
to the literature and is certainly not comprehensive, i.e., for each topic we give one or 
two representative publications from which the reader can iterate the reference finding 
process. 

Several chapters in this book extend our discussion, both providing a more detailed 
introduction to game theory, and discussing some examples on reputation systems. 
In particular, Chapter 23 on incentives in peer-to-peer systems includes a detailed 
discussion on the use of reputation systems in peer-to-peer environments. 

There is a large literature on economic models of reputation. The following 
classic articles provide some foundations: Kreps and Wilson (1982), Milgrom and 
Roberts (1982), Fudenberg and Levine (1989), and Kandori (1992). Tadelis (1999) 
considers trading reputations, and shows that it is not always undesirable. Dellarocas (2001) 
analyzes the economic efficiency of different feedback aggregation mechanisms. For 
broad overviews of this area, see Dellarocas (2003) and Resnick et al. (2000). 

Our presentation of whitewashing follows Friedman and Resnick (2001). That paper 
includes a detailed proof that no equilibrium can yield substantially more cooperation 
than the Paying Your Dues equilibrium. Also see Lai et al. (2003), which introduced 
the term whitewashing. 

Recently, the robustness of reputation systems to manipulation has attracted 
considerable research. The peer-prediction method to elicit honest feedback was originally 
described in an article by Miller et al. (2005). See Cooke (1991, p. 139) and Selten (1998) 
for a discussion of strictly proper scoring rules. Jurca and Faltings (2006) study 
modifications to the scoring rule to reduce the total expected payment. Bhattacharjee and 
Goel (2006) treat the revenues generated by a set of ratings as an objective 
indicator of the quality of the ratings. They provide an algorithm for dividing the revenues 
among raters in a way that creates incentives for entities to correct errors in the current 
community rating consensus. 

Maintaining reputations for raters can provide signals about rater quality, in 
addition to incentives for good performance. Awerbuch and Kleinberg (2005) describe an 
algorithm that agents can use to learn who the good raters are. Their solution is robust 
to malicious as well as strategic attackers, provided that there are some altruistic raters 
who will rate accurately without incentives. 

Many researchers have presented transitive-trust approaches to calculating 
reputations; a general framework using path algebras is described by Richardson et al. (2003). 
Altman and Tennenholtz (2006) study reputation systems from an axiomatic point of 
view, and present many possibility and impossibility results of the same flavor as those found in 
Section 27.5.1. Chien et al. (2003) prove that PageRank is monotonic. Our presentation 
of the sybilproofness of reputation systems follows Cheng and Friedman (2005). Many 
proposed solutions to the sybil attack implicitly or explicitly use the idea of a seed to 
break the symmetry of the reputations; for example, see Gyöngyi et al. (2004). The 
Advogato metric proposed by Levien (2004) also falls in this category. An alternative 
approach is described by Goel et al. (Zhang et al., 2004; Bhattacharjee and Goel, 2005). 


Exercises 


For context, each problem is preceded by the number of the relevant section. 


27.1 (27.2) Verify that if the stage game payoff is constant, the (discounted) average 
payoff per round equals that constant. That is, if $p_i^t = c$ then $\bar{p}_i = c$. 


27.2 (27.2) The well-known “tit-for-tat” (TFT) strategy can be defined as: in round i play 
the strategy that your opponent played in round i − 1, starting with C. Show that 
TFT, played by all players, is not an SPNE for any $\delta < 1$. 


27.3 (27.2) Recall our definition of the Grim strategy: play C unless some player has 
played D in a previous round. Explain why it should not be defined in the apparently 
equivalent manner: “Play C unless the other player has played D in a previous 
round.” (Hint: SPNE strategies need to be optimal even on play paths that should not 
arise!) 


27.4 (27.3) Verify that PYD is indeed an SPNE. In particular, show that deviating from 
the PYD strategy by playing D instead of C is not profitable when $\delta > 1/2$. (Hint: 
Argue that, no matter the reputation of the deviator’s partner in the next round, she 
could get a payoff 2 higher if her own reputation is good than if it is bad.) 


27.5 (27.3) Compute the equilibrium conditions for personalized-grim. (Hint: Consider 
each personalized game as a separate game where players only play in some 
randomly chosen periods.) 


27.6 (27.4) Suppose that a rater i can see the ratings of a rater j ($j \neq r(i)$) before 
she submits her rating. Suppose that i was paid off according to the scoring rule 
$T(x_{r(i)} \mid x_i)$ defined in equation 27.2. Construct an example in which honest rating 
is not always optimal for i. 


27.7 (27.4) Consider a situation with two events h and l, in which a player is asked 
to report her belief p about the probability of h. The quadratic scoring rule is 
defined by $T(h \mid p) = a + 2bp - b[p^2 + (1 - p)^2]$, $T(l \mid p) = a + 2b(1 - p) - b[p^2 + (1 - p)^2]$, 
where a and b are constant parameters. Show that the quadratic scoring 
rule is a proper scoring rule. Derive upper and lower bounds on the player’s score 
in terms of the parameters. 


27.8 (27.5) Modify the assumptions in the sybilproofness argument for a specific setting 
and check which of the results are changed. (For example, assume that incoming 
trust edges cannot be moved, as would be the case for Web page ranking.) 


27.9 (27.5) Compute the probability that a sybil changes the rank ordering of two nodes 
for a randomly generated trust graph for the ranking procedures discussed. (Choose 
any random model you like and either try to prove a general result or explicitly 
compute for a small, 3-5 node, graph.) 


CHAPTER 28 


Sponsored Search Auctions 


Sébastien Lahaie, David M. Pennock, Amin Saberi, 
and Rakesh V. Vohra 


Abstract 


One of the more visible means by which the Internet has disrupted traditional activity is the manner 
in which advertising is sold. Offline, the price for advertising is typically set by negotiation or posted 
price. Online, much advertising is sold via auction. Most prominently, Web search engines like Google 
and Yahoo! auction space next to search results, a practice known as sponsored search. This chapter 
describes the auctions used and how the theory developed in earlier chapters of this book can shed 
light on their properties. We close with a brief discussion of unresolved issues associated with the 
sale of advertising on the Internet. 


28.1 Introduction 


Web search engines like Google and Yahoo! monetize their service by auctioning off 
advertising space next to their standard algorithmic search results. For example, Apple 
or Best Buy may bid to appear among the advertisements — usually located above 
or to the right of the algorithmic results — whenever users search for “ipod.” These 
sponsored results are displayed in a format similar to algorithmic results: as a list of 
items each containing a title, a text description, and a hyperlink to the advertiser’s Web 
page. We call each position in the list a slot. Generally, advertisements that appear 
in a higher ranked slot (higher on the page) garner more attention and more clicks 
from users. Thus, all else being equal, merchants generally prefer higher ranked slots 
to lower ranked slots. Figure 28.1(a) shows an example layout of sponsored search 
results for the query “las vegas travel.” Figure 28.1(b) shows the advertisers’ bids in 
the corresponding auction. 

Advertisers bid for placement on the page in an auction-style format where the 
larger their bid the more likely their listing will appear above other advertisements on 
the page. By convention, sponsored search advertisers generally pay per click, meaning 
that they pay only when a user clicks on their advertisement, and do not pay if their 
advertisement is displayed but not clicked. Overture Services, formerly GoTo.com and 
now owned by Yahoo! Inc., is credited with pioneering sponsored search advertising. 
Overture’s success prompted a number of companies to adopt similar business models, 
most prominently Google, the leading Web search engine today. Sponsored search is 
one of the fastest growing, most effective, and most profitable forms of advertising, 
generating roughly $7 billion in revenue in 2005 after nearly doubling every year for 
the previous 5 years. 


Figure 28.1. (a) Search results: an example display of sponsored search listings above the regular algorithmic 
listings for the query “las vegas travel.” The ordering of sponsored listings is determined via a 
continuous auction mechanism. (b) Advertisers’ bids: the top advertisers’ bids (maximum willingness to pay per 
click) in the auction. 

The sponsored search industry typically runs separate auctions for each search query: 
for example, the queries “plasma television” and “investment advice” are associated 
with two distinct auctions. The entity being sold in each auction is the right to appear 
alongside the results of that search query. As mentioned, bids are expressed as a 
maximum willingness to pay per click. For example, a 40-cent bid by HostRocket 
for “Web hosting” means HostRocket is willing to pay up to 40 cents every time a 
user clicks on their advertisement. Advertisers may also set daily or monthly budget 
caps. In practice, hundreds of thousands of advertisers compete for positions alongside 
several millions of search queries every day. Generally the auctions are continuous and 
dynamic, meaning that advertisers can change their bids at any time, and a new auction 
clears every time a user enters a search query. In this way advertisers can adapt to 
changing environments, for instance by boosting their bids for the query “buy flowers” 
during the week before Valentine’s Day. The search engine evaluates the bids and 
allocates slots to advertisers. Notice that, although bids are expressed as payments per 
click, the search engine cannot directly allocate clicks, but rather allocates impressions, 
or placements on the screen. Clicks relate only stochastically to impressions. 
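
The distinction between allocating impressions and selling clicks can be made concrete with a toy example. The sketch below assumes, purely for illustration, that slots are assigned in descending order of bid and that each slot has a fixed click-through rate; the advertiser names, bids, and rates are hypothetical, and no payment rule is modeled.

```python
def allocate_slots(bids, slot_click_rates):
    """Assign slots to advertisers in descending order of bid (a simplifying assumption).

    bids: dict advertiser -> maximum willingness to pay per click.
    slot_click_rates: click probability per impression, best slot first.
    Returns a list of (advertiser, bid, click_rate), one entry per filled slot.
    """
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    return [(name, bid, rate) for (name, bid), rate in zip(ranked, slot_click_rates)]

def expected_clicks(impressions, click_rate):
    """Clicks follow impressions only stochastically; this is their expectation."""
    return impressions * click_rate

bids = {"advertiser_a": 0.40, "advertiser_b": 1.10, "advertiser_c": 0.75}
allocation = allocate_slots(bids, slot_click_rates=[0.10, 0.05])
for name, bid, rate in allocation:
    print(name, bid, expected_clicks(1000, rate))
```

With two slots and three bidders, the lowest bidder receives no impression at all, and the advertiser in the top slot expects twice as many clicks per thousand impressions as the one below it under these assumed rates.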

Advertising in traditional media is typically sold on a per-impression basis, or 
according to the (estimated) number of people exposed to the advertisement, in part 
because of the difficulty of measuring and charging based on the actual effectiveness 
of the advertisement. Traditional (offline) advertising, and to a large extent banner 
advertising on the Web, is usually priced via an informal process of estimation and 
negotiation. The Web’s capability for two-way communication makes it easy to track 
some measures of effectiveness, in particular user clicks. Many advertisers, especially 
direct marketers looking to close a sale as opposed to brand advertisers, prefer to 
pay per click rather than per impression, alleviating some of the uncertainty inherent 
in an impression. More direct performance-based pricing is possible by charging per 
“action” or per conversion (sale) on the merchant’s site. 

Search engines are an information gateway to many search and decision-making 
tasks. Industry surveys report that more than 50% of Web users visit a search engine 
every day, Americans conduct roughly 6 billion Web searches per month, over 13% 
of traffic to commercial sites is generated by search engines, and over 40% of product 
searches on the Web are initiated via search engines. As a result, entire niche industries 
exist touting services to boost a Web page’s ranking on the popular search engines, 
in part by reverse engineering the search engines’ information retrieval algorithms. 
Research has shown that good placement on a search page leads to high traffic, and 
eventually an increased financial payoff. Paying for sponsored slots is an alternative 
means of obtaining prominent positioning. Sponsored search works because users 
often tolerate or even welcome targeted advertisements directly related to what they 
are actively searching for. For example, Majestic Research reports that as many as 17% 
of Google searches result in a paid click, and that Google earns roughly nine cents on 
average for every search query they process. Today, Internet giants Google and Yahoo! 
boast a combined market capitalization of over $150 billion, largely on the strength 
of sponsored search. PricewaterhouseCoopers and the Interactive Advertising Bureau 
estimate that in 2005, industry-wide sponsored search revenue in the United States 
reached $5.1 billion, or 41% of total U.S. Internet advertising revenues and 2% of 
all U.S. advertising revenues. Roughly 85% of Google’s $4.1 billion in 2005 revenue 
and roughly 45% of Yahoo!’s $3.7 billion in 2005 revenue is likely attributable to 
sponsored search. A number of other companies — including eBay (Shopping.com), 
FindWhat, InterActiveCorp (Ask.com), LookSmart, and Microsoft (MSN.com) — earn 
hundreds of millions of dollars in sponsored search revenue annually. 

The goal of this chapter is to formally model and analyze various mechanisms used 
in this domain and to study potential improvements. In Section 28.2, we briefly describe 
existing mechanisms used to allocate and price sponsored search advertisements. Sub- 
sequently in Sections 28.3 and 28.4 we discuss formal models used to analyze the prop- 
erties of these auctions. Section 28.5 discusses further extensions and open problems. 


28.2 Existing Models and Mechanisms 


Typically, in sponsored search mechanisms, the advertisers specify a list of pairs of 
keywords and bids as well as a total maximum daily or weekly budget. Then, every 
time a user searches for a keyword, an auction takes place among the set of interested 
advertisers who have not exhausted their budgets. 

Focusing on a single auction, let $n$ be the number of bidders and $m < n$ the number 
of slots. The search engine estimates $\alpha_{ij}$, the probability that a user will click on the 
$i$th slot when it is occupied by bidder $j$. The quantity $\alpha_{ij}$ is called a click-through rate 
(CTR). It is usually presumed for all $j$ that $\alpha_{ij} > \alpha_{i+1,j}$ for $i = 1, \ldots, m-1$.¹ 


¹ The assumption that clickthrough rate decays monotonically with lower slots is a distinguishing feature of 
keyword auctions; in particular, it implies that all bidders prefer the first slot to the second, the second slot to 
the third, etc. This allows for more refined equilibrium analyses than in the more general multi-item case. 

The search engine also assigns a weight $w_j$ to each advertiser $j$. The weight can 
be thought of as a relevance or quality metric. If agent $j$ bids $b_j$, his corresponding 
score is $s_j = w_j b_j$. The search engine allocates slots in decreasing order of scores, 
so that the agent with highest score is ranked first, and so on. We assume throughout 
that agents are numbered so that agent $j$ obtains slot $j$. An agent pays per click the 
lowest bid necessary to retain his position, so that the agent in slot $j$ pays $s_{j+1}/w_j$. 
This weighted bid ranking mechanism includes the two most prominent keyword 
auction designs that have been used in practice: Overture introduced a “rank by bid” 
mechanism ($w_j = 1$) whereas Google uses a “rank by revenue” mechanism ($w_j = 
\alpha_{1j}$). Both variants are sometimes called generalized second price (GSP) auctions. 
Prior to 2004, Yahoo! used what is now known as a generalized first price (GFP) 
auction. Agents are ranked by bid but each bidder who secures a slot pays their bid per 
click. 
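
The weighted bid ranking rule described above can be sketched in a few lines. This is a minimal illustration rather than any search engine's actual implementation: the function name `weighted_gsp`, the tie-breaking rule (lower index first) and the sample bids and weights are all assumptions made for the example.

```python
def weighted_gsp(bids, weights, num_slots):
    """Rank bidders by score s_j = w_j * b_j and price each slot GSP-style.

    Returns a list of (bidder_index, price_per_click) pairs, top slot first.
    """
    scores = [w * b for w, b in zip(weights, bids)]
    # Decreasing order of score; ties broken by lower index (an assumption).
    order = sorted(range(len(bids)), key=lambda j: (-scores[j], j))
    result = []
    for pos in range(min(num_slots, len(order))):
        j = order[pos]
        # Score of the next-ranked bidder, or 0 if nobody is ranked below.
        next_score = scores[order[pos + 1]] if pos + 1 < len(order) else 0.0
        # Lowest bid that keeps the position: s_{j+1} / w_j per click.
        result.append((j, next_score / weights[j]))
    return result


# Rank by bid (w_j = 1).
by_bid = weighted_gsp([3.0, 2.0, 1.0], [1.0, 1.0, 1.0], num_slots=2)
# Rank by revenue: the weights stand in for estimated top-slot CTRs.
by_revenue = weighted_gsp([3.0, 2.0, 1.0], [0.1, 0.2, 0.1], num_slots=2)
```

In the second call the bidder with the lower bid but the higher weight takes the top slot, which is the practical difference between the two ranking conventions.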


28.3 A Static Model 


The most popular model used to analyze keyword auctions is a static one where 
the private information of bidder $j$, the expected payoff from a click, $v_j$, is one 
dimensional. The expected payoff to a bidder from not obtaining a slot is assumed to 
be 0. 

Four features of the model deserve comment. The first is its static nature: a 
substantial departure from reality. Since the study of recurrent auctions is rather 
daunting, one may be disposed to accept this simplification. Second, the expected 
payoff per click to a bidder is slot independent. This is tied to the assumption that all 
bidders prefer the top slot to the second slot to the third slot and so on. Some advertisers 
believe that the probability of a click being converted into a purchase is lower in 
the top slot than in the second slot because many clicks on the top slot are made 
in error, or because a searcher who clicks on a lower-ranked slot is more serious in 
their intent to purchase. Although the story sounds plausible, conversion-tracking data 
from Isobar Communications and other sources does not substantiate the hypothesis: 
in reality the top slot appears to convert about as well as other slots. Third, a bidder’s 
value and CTR for a slot do not depend on the identity of other bidders. It seems 
plausible that Avis might value the fact that Hertz is not present in any slot when Avis 
is present. Fourth, CTRs are assumed to be common knowledge. In practice CTRs are 
estimated by the search engine and can be conditioned on many factors, including user 
characteristics and page context. Estimating CTRs is a significantly complex machine 
learning problem for the search engine, including a built-in explore/exploit trade-off. 
Moreover, bidders’ estimates of CTRs may be less accurate since bidders do not have 
access to the same contextual information available to the search engine. The dynamic 
nature of the environment means that CTRs can fluctuate dramatically over small 
periods. 

As usual we assume that bidders are risk neutral and that their utility for a slot can 
be denominated on a common monetary scale. Supplied with copious amounts of salt, 
let us see where this model takes us. 




28.3.1 Revenue Maximization and Efficiency 


An auctioneer usually has one of two objectives: revenue maximization or allocative 
efficiency. In the static model one knows exactly what auction design will achieve 
either objective. 

If the goal is revenue maximization, the classic result of Myerson (described in 
Chapter 13) applies directly. One simply relabels the allocation variables. In Chapter 13 
Section 13.1.12, the allocation variable, $x_i(b)$, is defined to be the expected quantity 
received by bidder $i$ who bids $b$. For our setting, $x_i(b)$ becomes the expected click-through 
rate for a bidder who bids $b$. Basically the generalized Vickrey auction is applied not 
to the actual values, $v_i$, but to the corresponding virtual values. The upshot is that the 
revenue maximizing auction is a generalized Vickrey auction with reserve prices. 

If the goal is allocative efficiency, the generalized Vickrey auction will do the trick. 
The auction is described in Chapters 9 and 11 of this book. The underlying problem 
of finding the efficient allocation in this case is an instance of the maximum weight 
assignment problem. For each slot $i$ and bidder $j$ let $x_{ij} = 1$ if bidder $j$ is assigned to 
slot $i$ and zero otherwise. Writing $k$ for the number of slots (called $m$ above), the object 
is to choose $x_{ij}$’s to solve the following: 


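\[ \max \sum_{i=1}^{k} \sum_{j=1}^{n} \alpha_{ij} v_j x_{ij} \tag{28.1} \]
\[ \text{s.t.} \quad \sum_{j=1}^{n} x_{ij} \le 1 \quad \forall i = 1, \ldots, k \tag{28.2} \]
\[ \sum_{i=1}^{k} x_{ij} \le 1 \quad \forall j = 1, \ldots, n \tag{28.3} \]
\[ x_{ij} \ge 0 \quad \forall i = 1, \ldots, k, \ \forall j = 1, \ldots, n \tag{28.4} \]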


This is equivalent to finding a maximum-weight perfect matching in a bipartite 
graph and hence can be solved in polynomial time. In fact, because the constraint 
matrix of this linear program is totally unimodular, it will have an optimal solution that 
is integral. Any feasible integer solution is called an assignment. 

A single computation of the maximum weight assignment is sufficient to determine 
both the allocation and the generalized Vickrey payments. This is because the Vickrey 
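payments lie in the dual to the above linear program. To write down the dual, let $p_i$ be 
the dual variable associated with (28.2) and $q_j$ the dual associated with (28.3). 

\[ \min \sum_{i=1}^{k} p_i + \sum_{j=1}^{n} q_j \tag{28.5} \]
\[ \text{s.t.} \quad p_i + q_j \ge \alpha_{ij} v_j \quad \forall i = 1, \ldots, k, \ \forall j = 1, \ldots, n \tag{28.6} \]
\[ p_i, \, q_j \ge 0 \quad \forall i = 1, \ldots, k, \ \forall j = 1, \ldots, n \tag{28.7} \]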


Here $p_i$ can be interpreted as the expected payment (CTR times price per click) of the 
bidder obtaining slot $i$, and $q_j$ as the profit of bidder $j$. The objective in this program 
is to minimize the bidders’ and auctioneer’s profits combined. Among all optimal dual 
solutions, pick the one that minimizes $\sum_{i} p_i$. The corresponding $p_i$ is the price that 
the generalized Vickrey auction would set for slot $i$. 

In the special case when the CTRs are bidder independent (i.e., $\alpha_{ij} = \mu_i$) there is a 
particularly simple algorithm, called the Northwest corner rule, to find the maximum 
weight assignment. Assign the bidder with the highest value per click to the top slot, 
the bidder with the second highest value per click to the second slot, and so on. In the 
Economics literature this is called an assortative assignment. 
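
For the bidder-independent case, the assortative assignment and the associated expected Vickrey payments can be computed with a single sort. The sketch below is illustrative only: the function name `assortative_vcg` and the sample numbers are assumptions, and the payment formula used is the standard externality calculation, $p_i = \sum_{t \ge i} (\mu_t - \mu_{t+1}) v_{(t+1)}$ with $\mu$ taken to be zero below the last slot and $v_{(t+1)}$ the value of the bidder ranked just below slot $t$.

```python
def assortative_vcg(values, ctrs):
    """values[j] = v_j; ctrs[i] = mu_i, assumed decreasing in i.

    Returns (assignment, payments): assignment[i] is the bidder placed in
    slot i and payments[i] is that bidder's expected Vickrey payment p_i.
    """
    # Highest value per click first (ties by lower index, an assumption).
    order = sorted(range(len(values)), key=lambda j: (-values[j], j))
    m = min(len(ctrs), len(values))
    mu = list(ctrs[:m]) + [0.0]  # CTR below the last slot is taken as 0
    payments = [0.0] * m
    running = 0.0
    # Accumulate from the bottom slot upward: each term is the loss imposed
    # on the bidder who would otherwise move up one position.
    for i in reversed(range(m)):
        next_value = values[order[i + 1]] if i + 1 < len(order) else 0.0
        running += (mu[i] - mu[i + 1]) * next_value
        payments[i] = running
    return order[:m], payments


assignment, payments = assortative_vcg([4.0, 10.0, 6.0], [0.5, 0.25])
```

Dividing `payments[i]` by `ctrs[i]` gives the corresponding price per click.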

If one objects to the sealed bid nature of the generalized Vickrey auction there are 
ascending implementations available. 

Interestingly, neither of these auctions corresponds to the GFP or GSP auctions. In 
particular, bidding truthfully is not an equilibrium of either the GFP or GSP auctions. 
It is interesting to observe that Google’s promotional material touts their auction as 
a modification of Vickrey’s sealed bid auction for a single item (which it is) and 
concludes, therefore, that bidding sincerely is the correct thing to do (which it is not). 
A similar claim was made with respect to their auction used to sell shares of their 
IPO. They are not the first and quite possibly not the last to make such claims. For 
example, the financial services firm Hambrecht, which pioneered the use of auctions 
to sell IPOs in 1998, says that their auction design is based on the Vickrey auction for 
a single good. While the Hambrecht auction does specialize to the Vickrey auction for 
a single good, it does not inherit the attractive properties of the Vickrey auction when 
applied to multiple units.² 

To see why one must be careful when generalizing the Vickrey auction to the sale 
of more than one unit, suppose that there are three bidders with $v_1 > v_2 > v_3$ and two 
slots. Also, suppose that $\alpha_{ij} = \mu_i$ with $\mu_1 > \mu_2$. If one were to auction off the top slot 
only, by an English ascending auction, each bidder would remain in as long as at the 
current price their surplus is nonnegative. So, if the current price on the top slot is $p_1$, 
bidder $j$ remains active if $\mu_1(v_j - p_1) \ge 0$. Hence the auction ends at a price $p_1$ where 
$\mu_1(v_2 - p_1) = 0$, i.e., $p_1 = v_2$. Now suppose that both slots are available but we will 
auction off the top slot first followed by the second slot. Let $p_1$ be the current price of 
slot 1, $p_2 = 0$ the current price of slot 2. Now bidder $j$ will remain active in the auction 
for the top slot provided their surplus from the top slot is at least as large as the surplus 
they could get from the second slot (which is currently priced at zero). That is, 

\[ \mu_1 (v_j - p_1) \ge \mu_2 (v_j - 0) \;\Rightarrow\; p_1 \le \Bigl(1 - \frac{\mu_2}{\mu_1}\Bigr) v_j. \]

Therefore the auction on the top slot terminates at a price of $(1 - \mu_2/\mu_1) v_2 < v_2$. The 
point is that the presence of a second slot lowers the price at which a bidder on the 
top slot will drop out of the auction on the top slot. The generalized Vickrey auction 
incorporates this change in the outside option of a bidder to ensure truthful bidding. 
The GSP auction does not. The generalized Vickrey auction, however, would allocate 
the top slot to bidder 1 and charge her $(1 - \mu_2/\mu_1) v_2 + (\mu_2/\mu_1) v_3$ per click, and the 
second slot to bidder 2 and charge her $v_3$ per click. 
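
A small numerical check makes the failure of truthful bidding under the GSP concrete. The sketch below assumes rank-by-bid with bidder-independent CTRs; the helper `gsp_utility`, the tie rule (a tied bid loses to the competitor) and all numbers are illustrative assumptions, not taken from the text.

```python
def gsp_utility(my_value, my_bid, other_bids, ctrs):
    """Expected utility of one bidder in a rank-by-bid GSP auction."""
    # Slot index = number of competing bids that beat ours (ties lose here).
    rank = sum(1 for b in other_bids if b >= my_bid)
    if rank >= len(ctrs):
        return 0.0  # no slot, payoff 0
    below = sorted((b for b in other_bids if b < my_bid), reverse=True)
    price = below[0] if below else 0.0  # next-highest bid, paid per click
    return ctrs[rank] * (my_value - price)


ctrs = [1.0, 0.9]    # mu_1 > mu_2
others = [9.0, 1.0]  # bidders 2 and 3 bid their values v_2 and v_3
# Bidder 1 has value 10. Bidding truthfully wins the top slot at price 9.
truthful = gsp_utility(10.0, 10.0, others, ctrs)
# Shading the bid to 5 wins the second slot at price 1 instead.
shaded = gsp_utility(10.0, 5.0, others, ctrs)
```

With these numbers the shaded bid earns far more than the truthful one, so truthful bidding cannot be a best response to truthful bidding by the others.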

As noted above, the GFP and GSP are special cases of what have been called ranking 
auctions. Bids (the reported $v_j$’s) are weighted (weights are independent of the bids) 
and then ranked in the descending order. The highest ranked bidder gets the top slot, 
the second highest ranked bidder gets the second slot, and so on. The higher the bid the 
higher the slot one obtains (other bids held fixed). Since the assignment of bidders to 
slots is monotonic in the bid (other bids held fixed) it follows from standard results (see 
Section 9.36 of Chapter 9 for example) that there exists a payment rule that will make 
truthful bidding an equilibrium of the resulting auction. That payment rule is described, 
for example, in Section 13.1.2 of Chapter 13. Let $x_i(b \mid b_{-i})$ denote the expected 
click-through rate for agent $i$ when she bids $b$, given the profile of other bids is $b_{-i}$. Then 
the payment $P_i(b \mid b_{-i})$ she must make to ensure incentive compatibility is given by 

\[ P_i(b \mid b_{-i}) = b \, x_i(b \mid b_{-i}) - \int_0^b x_i(t \mid b_{-i}) \, dt. \tag{28.8} \]

² All of this reminds one of what is known as the freshman binomial theorem: $(a + b)^n = a^n + b^n$. True for 
$n = 1$ but not for $n \ge 2$. 


These ranking auctions are, in general, neither efficient nor revenue maximizing. 
(Though in the exercises, we explore a special case ranking that is efficient.) The 
payment rules associated with the GFP and GSP are not such as to induce truthful 
bidding as an equilibrium. 
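
To see what the payment rule (28.8) produces, one can evaluate it for the simplest ranking auction: rank by bid with bidder-independent CTRs. In that case the allocation $x_i(t \mid b_{-i})$ is a step function of $t$, so the integral is a finite sum. The sketch below is an illustration under those assumptions; the function name `ic_payment` and the numbers are hypothetical.

```python
def ic_payment(bid, other_bids, ctrs):
    """Evaluate P(b | b_-i) = b * x(b) - integral_0^b x(t) dt for rank-by-bid.

    With bidder-independent CTRs, x(t) equals ctrs[r] when exactly r
    competing bids exceed t, and 0 when r >= len(ctrs).
    """
    def x(t):
        r = sum(1 for b in other_bids if b > t)
        return ctrs[r] if r < len(ctrs) else 0.0

    # Breakpoints of the step function inside [0, bid].
    points = sorted({0.0, bid, *[b for b in other_bids if 0.0 < b < bid]})
    integral = 0.0
    for lo, hi in zip(points, points[1:]):
        integral += x((lo + hi) / 2) * (hi - lo)  # x is constant on (lo, hi)
    return bid * x(bid) - integral


values = [10.0, 6.0, 4.0]
ctrs = [0.5, 0.25]
# Expected payment of each bidder when everyone bids truthfully.
payments = [ic_payment(values[j], values[:j] + values[j + 1:], ctrs)
            for j in range(len(values))]
```

In this example the two slot winners owe expected payments of 2.5 and 1.0 and the losing bidder owes nothing, which matches the externality each winner imposes on the bidders ranked below.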


28.3.2 Equilibrium Properties 


The fact that neither the GFP nor GSP is incentive compatible does not imply that they 
are inefficient or suboptimal in terms of revenue. It is possible that the equilibrium 
outcomes of both these auctions may be efficient or revenue maximizing. To identify 
the revenue and efficiency properties of these auctions, it is necessary to determine 
their equilibria. 

The GFP auction does not admit a pure strategy full-information equilibrium but does 
admit a pure strategy Bayes-Nash symmetric equilibrium. The argument is identical to 
that of the sealed bid first price auction for a single good. The equilibrium bid functions 
are monotonic in the value. Therefore the equilibrium allocation of bidders to slots is 
the same as in the efficient allocation. Hence, by the revenue equivalence theorem, the 
symmetric equilibrium is efficient. 

The efficiency of the GFP (in a Bayesian setting) lends it some appeal but this is 
where the “static” assumption has bite. In a dynamic setting, the absence of a pure 
strategy full-information equilibrium encourages bidders to constantly adjust their bids 
from one period to the next. This produces fluctuations in the bids over time and it has 
been argued that these fluctuations resulted in significant inefficiencies. 

To date nothing is known about the Bayesian equilibrium of the GSP auction. 
Assume for simplicity that CTRs are bidder-independent, so $\alpha_{ij} = \mu_i$, and that all 
weights are set to 1. The analysis in this section generalizes straightforwardly to the 
case where CTRs are separable (i.e., $\alpha_{ij} = \mu_i \beta_j$) and agents are assigned arbitrary 
weights $w_j$. These extensions are developed in the exercises. 

In this case one can show that the GSP is efficient under full information and a 
restricted notion of equilibrium called locally envy-free. An assignment $x$ is called locally 
envy-free if there exist prices, $\{p_i\}$, one for each slot, such that for all $i, j$ with $x_{ij} = 1$ 

\[ \mu_i v_j - p_i \ge \mu_{i-1} v_j - p_{i-1} \tag{28.9} \]
and 
\[ \mu_i v_j - p_i \ge \mu_{i+1} v_j - p_{i+1} \tag{28.10} \]

In words, if bidder $j$ is assigned to slot $i$, then she prefers slot $i$ to the slot just above 
her and the slot just below her. 


Theorem 28.1 An assignment x* is optimal if and only if it is locally envy-free. 


PROOF Suppose first that $x^*$ is locally envy-free and let $p^*$ be the corresponding 
price vector. It suffices to prove that the assignment $x^*$ is assortative. Let $j$ be such 
that $x^*_{ij} = 1$ and $j'$ such that $x^*_{i+1,j'} = 1$. To show that the assignment is assortative, 
we must show that $v_j \ge v_{j'}$. From the property of being locally envy-free, we 
have 

\[ \mu_i v_j - p^*_i \ge \mu_{i+1} v_j - p^*_{i+1} \]
and 
\[ \mu_{i+1} v_{j'} - p^*_{i+1} \ge \mu_i v_{j'} - p^*_i. \]
Adding them together yields 
\[ (\mu_i - \mu_{i+1})(v_j - v_{j'}) \ge 0. \]

Since $\mu_i > \mu_{i+1}$ it follows from this inequality that $v_j \ge v_{j'}$. 

Now let $x^*$ be an optimal assignment. Let $(p^*, q^*)$ denote an optimal dual 
solution. It suffices to show that $(x^*, p^*)$ is locally envy-free. Consider a pair 
$(r, j)$ such that $x^*_{rj} = 1$. Complementary slackness and dual feasibility imply 
that $\mu_r v_j - p^*_r = q^*_j = \max_i \{\mu_i v_j - p^*_i\}$. Therefore 

\[ \mu_r v_j - p^*_r \ge \max\{\mu_{r-1} v_j - p^*_{r-1}, \ \mu_{r+1} v_j - p^*_{r+1}\}. \]


Theorem 28.2 The GSP has a full information equilibrium that yields an allo- 
cation that is locally envy-free. 


PROOF Order the bidders so that $v_1 \ge v_2 \ge \cdots \ge v_n$. Let $p^*_i$ be the Vickrey 
price of slot $i$. Let bidder 1 bid $b_1 = v_1$ and each bidder $j \ge 2$ bid $b_j = p^*_{j-1}/\mu_{j-1}$. 
First we show that under the rules of the GSP, bidder 1 is assigned to slot 1, bidder 
2 to slot 2, and so on. To do this, it suffices to show that $b_j \ge b_{j+1}$ for each $j$. Since the 
optimal assignment is locally envy-free, we have 

\[ \mu_j v_j - p^*_j \ge \mu_{j-1} v_j - p^*_{j-1}. \]

Therefore 

\[ p^*_{j-1} \ge p^*_j + (\mu_{j-1} - \mu_j) v_j, \]

which, together with $v_j \ge p^*_j/\mu_j$ (bidder $j$’s profit $\mu_j v_j - p^*_j$ is nonnegative), implies 

\[ b_j = \frac{p^*_{j-1}}{\mu_{j-1}} \ge \frac{p^*_j}{\mu_{j-1}} + \Bigl(1 - \frac{\mu_j}{\mu_{j-1}}\Bigr) v_j \ge \frac{p^*_j}{\mu_{j-1}} + \Bigl(1 - \frac{\mu_j}{\mu_{j-1}}\Bigr) \frac{p^*_j}{\mu_j} = \frac{p^*_j}{\mu_j} = b_{j+1}. \]

For $j = 1$ the same nonnegativity gives $b_1 = v_1 \ge p^*_1/\mu_1 = b_2$ directly. 
Hence if each bidder $j$ bids $b_j$ the GSP returns the optimal assignment. It is also 
easy to see that bidder $j \le m$ pays $p^*_j$ for their slot. Bidder $j > m$ pays zero. 
Since each bidder pays their Vickrey price and receives the slot they would have 
under the efficient allocation, no bidder has a unilateral incentive to change their 
bid. Therefore we have an equilibrium that, from Theorem 28.1, is locally envy-free. 
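
The bid profile constructed in the proof of Theorem 28.2 can be checked numerically on a small instance. The sketch below is illustrative only: it assumes rank-by-bid with bidder-independent CTRs, bidders already sorted by value, ties broken in favor of the lower index, and a zero bid for any bidder ranked below slot $m+1$ (the example has none). The helper names and the numbers are hypothetical.

```python
def vickrey_payments(values, ctrs):
    """Expected Vickrey payment p_i per slot; values sorted in decreasing order."""
    m = min(len(ctrs), len(values))
    mu = list(ctrs[:m]) + [0.0]
    pay, running = [0.0] * m, 0.0
    for i in reversed(range(m)):
        nxt = values[i + 1] if i + 1 < len(values) else 0.0
        running += (mu[i] - mu[i + 1]) * nxt
        pay[i] = running
    return pay


def gsp_utility(j, bids, values, ctrs):
    """Bidder j's expected utility under rank-by-bid GSP."""
    order = sorted(range(len(bids)), key=lambda t: (-bids[t], t))
    slot = order.index(j)
    if slot >= len(ctrs):
        return 0.0
    price = bids[order[slot + 1]] if slot + 1 < len(order) else 0.0
    return ctrs[slot] * (values[j] - price)


values = [10.0, 6.0, 4.0]  # v_1 > v_2 > v_3
ctrs = [0.5, 0.25]         # mu_1 > mu_2
p = vickrey_payments(values, ctrs)
# b_1 = v_1 and b_j = p_{j-1} / mu_{j-1} for j >= 2.
bids = [values[0]] + [p[j - 1] / ctrs[j - 1] if j - 1 < len(p) else 0.0
                      for j in range(1, len(values))]


def best_deviation_gain(j):
    """Largest utility gain bidder j can get by changing only their own bid."""
    base = gsp_utility(j, bids, values, ctrs)
    grid = [x / 4 for x in range(0, 49)]  # candidate bids 0, 0.25, ..., 12
    best = max(gsp_utility(j, bids[:j] + [b] + bids[j + 1:], values, ctrs)
               for b in grid)
    return best - base


gains = [best_deviation_gain(j) for j in range(len(values))]
```

On this instance every entry of `gains` is zero on the grid searched, and each slot winner's expected GSP payment `ctrs[i] * bids[i + 1]` equals the Vickrey payment `p[i]`.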


Absent the recurrent nature of keyword auctions, they are similar to what are known 
as condominium auctions. In a condominium auction, bidders are interested in pur- 
chasing a condominium in a building. The condominiums are identical except for their 
height above the ground, the side of the building they are located on, etc. If all bidders 
have identical preferences over the condominiums (i.e., everyone prefers to be on a 
higher floor), they coincide with keyword auctions. 


28.4 Dynamic Aspects 


Since these auctions are repeated with great frequency, one should properly model 
them as repeated games of incomplete information. The set of equilibria of such games 
is quite rich and complicated, even when restricted to the setting considered here. A 
full treatment of this case will not be given here. Rather we mention two phenomena 
that arise in this setting. 

One is known as bid rotation. This occurs when competing bidders take turns at 
winning the auction. In our context this might mean bidders take turns at occupying 
the top slot. If bidders are short lived, this is unlikely to be a problem; if not, it will 
lower the auctioneer’s revenue. 

Another possibility that repetition makes possible is vindictive bidding. In the GSP 
auction one’s bid determines the payment of the bidder in the slot above and not one’s 
own. Therefore one can increase the payment of the bidder in the slot above by raising 
one’s bid without affecting one’s own payment. This may be beneficial if the bidder 
in the slot above is a competitor with a limited budget for advertising. In a dynamic 
environment this encourages a bidder to constantly adjust their bids so as to inflict or 
avoid damage upon or from their competitor. 

Even if one could ignore strategic considerations, a problem remains. The online 
nature of the auctions in sponsored search complicates the computation of an efficient 
allocation. Below we describe one model that addresses this difficulty. 


28.4.1 The Online Allocation Problem 


In this model, the search engine receives the bids of advertisers and their maximum 
budget for a certain period (e.g., a day). As users search for these keywords during 
the day, the search engine assigns their advertisement space to advertisers and charges 
them the value of their bid for the impression of the advertisement.³ For simplicity of 
notation we assume that each page has only one slot for advertisements. The objective 
is to maximize total revenue while respecting the budget constraint of the bidders. Note 
that in this model bidders pay their bid, which is counter to practice. On the other hand, 
budget constraints that apply across a set of keywords, a real-world feature, are part of 
the model. 


³ If one scales the bids by the CTR, the model would accommodate pay per click. 

Let $n$ be the number of advertisers and $m$ the number of keywords. Suppose that 
advertiser $j$ has a bid of $b_{ij}$ for keyword $i$ and a total budget of $B_j$. In this context, it 
is reasonable to assume that bids are small compared to budgets, i.e., $b_{ij} \ll B_j$. 

If the search engine has an accurate estimate of $r_i$, the number of people searching 
for keyword $i$ for all $1 \le i \le m$, then it is easy to approximate the optimal allocation 
using a simple linear program. Let $x_{ij}$ be the total number of queries on keyword $i$ 
allocated to bidder $j$. The linear program is 


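\[ \max \sum_{i=1}^{m} \sum_{j=1}^{n} b_{ij} x_{ij} \]
\[ \text{s.t.} \quad \sum_{j=1}^{n} x_{ij} \le r_i \quad \forall\, 1 \le i \le m \tag{28.11} \]
\[ \sum_{i=1}^{m} b_{ij} x_{ij} \le B_j \quad \forall\, 1 \le j \le n \]
\[ x_{ij} \ge 0 \quad \forall\, 1 \le i \le m, \ \forall\, 1 \le j \le n \]

The dual of this linear program, with a variable $\alpha_i$ for each query constraint and a 
variable $\beta_j$ for each budget constraint, is 

\[ \min \sum_{j=1}^{n} B_j \beta_j + \sum_{i=1}^{m} r_i \alpha_i \]
\[ \text{s.t.} \quad \alpha_i + b_{ij} \beta_j \ge b_{ij} \quad \forall\, 1 \le i \le m, \ \forall\, 1 \le j \le n \]
\[ \beta_j \ge 0 \quad \forall\, 1 \le j \le n \]
\[ \alpha_i \ge 0 \quad \forall\, 1 \le i \le m \]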

By complementary slackness, in an optimal solution, advertiser $j$ is assigned to keyword 
$i$ if $(1 - \beta_j) b_{ij} = \max_{1 \le k \le n} (1 - \beta_k) b_{ik}$. Using this property, the search engine 
can use the solution of the dual linear program to find the optimum allocation: every 
time a user searches for keyword $i$, the search engine allocates its corresponding 
advertisement space to the bidder $j$ with the highest $b_{ij}(1 - \beta_j)$. In other words, the bid 
of advertiser $j$ will be scaled down by $1 - \beta_j$. 

Now $\beta_j$ represents the rate of change of the optimal objective function value of (28.11) 
for a sufficiently small change in the right-hand side of the corresponding constraint. 
In other words, if advertiser $j$’s budget were to increase by $\Delta$, the optimal objective 
function value would increase by $\beta_j \Delta$. Equivalently, it is the opportunity cost of 
consuming agent $j$’s budget. Hence, if we allocate keyword $i$ to agent $j$ now we obtain 
an immediate ‘payoff’ of $b_{ij}$. However, this consumes $b_{ij}$ of the budget, which imposes 
an opportunity cost of $\beta_j b_{ij}$. Therefore, it makes sense in the optimal solution to (28.11) 
to assign keyword $i$ to $j$ provided $b_{ij} - \beta_j b_{ij} \ge 0$. 

In practice, a good estimate of the frequencies of all search queries is unavailable. 
Queries arrive sequentially and the search engine must instantly decide to allocate their 
advertisement space to bidders without knowledge of the future queries. Therefore, 
what is needed is a dynamic procedure for allocating bidders to keywords that are 
queried. We describe one such procedure and analyze its performance within the usual 
competitive ratio framework. Specifically, we compare the revenue achieved by a 
dynamic procedure that does not know the $r_i$’s in advance, with the revenue that could 
be achieved knowing the $r_i$’s in advance. The revenue in this second case is given by the 
optimal objective function value of the program (28.11). 

The obvious dynamic procedure to consider is a greedy one: among the bidders 
whose budgets are not exhausted, allocate the query to the one with the highest bid. It 
is easy to see that this approach is equivalent to setting all $\beta_j$’s to 0. 

The greedy procedure is not guaranteed to find the optimum solution. It is easy to 
construct a simple example with two bidders and two keywords in which the revenue of 
the greedy algorithm is as small as half of the optimum revenue. For example, suppose 
there are two bidders, each with a budget of \$2. Assume that $b_{11} = 2$, $b_{12} = 2 - \epsilon$, $b_{21} = 2$, 
and $b_{22} = \epsilon$. If query 1 arrives before query 2, it will be assigned to bidder 1. Then 
bidder 1’s budget is exhausted. When query 2 arrives, it is assigned to bidder 2. This 
produces an objective function value of $2 + \epsilon$. The optimal solution would assign query 
2 to bidder 1 and query 1 to bidder 2, yielding an objective function value of $4 - \epsilon$. The 
problem with the greedy algorithm is that, unlike the solution to (28.11), it ignores the 
opportunity cost of assigning a query to a bidder. 
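
The two-bidder example can be replayed with a short simulation of the greedy rule. This is a sketch under stated assumptions: the function name `greedy_revenue`, the rule that a bidder is eligible only while the full bid still fits in the remaining budget, and the choice $\epsilon = 0.01$ are all illustrative.

```python
def greedy_revenue(bids, budgets, queries):
    """bids[i][j] = b_ij (keyword i, bidder j); queries lists keyword indices
    in arrival order. Returns the total revenue of the greedy rule."""
    spent = [0.0] * len(budgets)
    revenue = 0.0
    for i in queries:
        # Bidders who can still afford their bid on keyword i.
        eligible = [j for j in range(len(budgets))
                    if bids[i][j] > 0
                    and spent[j] + bids[i][j] <= budgets[j] + 1e-12]
        if not eligible:
            continue
        j = max(eligible, key=lambda t: bids[i][t])  # highest bid wins
        spent[j] += bids[i][j]
        revenue += bids[i][j]
    return revenue


eps = 0.01
bids = [[2.0, 2.0 - eps],  # keyword 1: b_11 = 2, b_12 = 2 - eps
        [2.0, eps]]        # keyword 2: b_21 = 2, b_22 = eps
greedy = greedy_revenue(bids, budgets=[2.0, 2.0], queries=[0, 1])
# Offline optimum: query 2 goes to bidder 1 and query 1 to bidder 2.
optimum = 2.0 + (2.0 - eps)
```

As $\epsilon$ shrinks, the ratio `greedy / optimum` approaches one half from above.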

One can prove that the revenue of the greedy algorithm is at least half of the optimum 
revenue for any instance. In the standard terminology of online algorithms, the 
competitive ratio of the greedy algorithm is 1/2. Can one do better in terms of competitive 
ratio? Yes. One does so by trying to dynamically estimate the opportunity cost, i.e., 
the $\beta_j$’s, of assigning a query to a bidder. This has the effect of spreading the bidders’ 
expenditures over time. The effect is called “budget smoothing,” and is a feature that 
some search engines offer their advertisers. 

The following modification of the greedy algorithm adaptively updates the $\beta_j$’s as 
a function of the bidders’ spent budgets. Let 

\[ \phi(x) = 1 - e^{x-1}. \]

The algorithm sets $\beta_j = 1 - \phi(f_j)$, where $f_j$ is the fraction of the budget of bidder $j$ 
that has been spent. 


Algorithm 1. Every time a query $i$ arrives, allocate its advertisement space to 
the bidder $j$ who maximizes $b_{ij} \phi(f_j)$, where $f_j$ is the fraction of bidder $j$’s 
budget that has been spent so far. 


The revenue of this algorithm is at least 1 — 1/e of the optimum revenue. It is also 
possible to prove that no deterministic or randomized algorithm can achieve a better 
competitive ratio. 
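
A side-by-side simulation illustrates how scaling bids by $\phi(f_j)$ changes the outcome when bids are small relative to budgets. The sketch below is not a reference implementation: the function names, the eligibility rule (a bidder must be able to afford the full bid) and the instance are assumptions chosen so that greedy performs badly.

```python
import math


def phi(x):
    """Trade-off function phi(x) = 1 - e^(x - 1)."""
    return 1.0 - math.exp(x - 1.0)


def run(bids, budgets, queries, scaled):
    """Online allocation: scaled=False is greedy, scaled=True is Algorithm 1."""
    spent = [0.0] * len(budgets)
    revenue = 0.0
    for i in queries:
        eligible = [j for j in range(len(budgets))
                    if bids[i][j] > 0
                    and spent[j] + bids[i][j] <= budgets[j] + 1e-9]
        if not eligible:
            continue

        def score(j):
            frac = spent[j] / budgets[j]  # f_j, fraction of budget spent
            return bids[i][j] * (phi(frac) if scaled else 1.0)

        j = max(eligible, key=score)
        spent[j] += bids[i][j]
        revenue += bids[i][j]
    return revenue


bids = [[1.0, 0.99],   # keyword 1: both bidders bid about 1
        [1.0, 0.01]]   # keyword 2: only bidder 1 bids a meaningful amount
budgets = [100.0, 100.0]
queries = [0] * 100 + [1] * 100  # all keyword-1 queries arrive first
greedy = run(bids, budgets, queries, scaled=False)
alg1 = run(bids, budgets, queries, scaled=True)
# Offline optimum: keyword 1 to bidder 2 (100 * 0.99), keyword 2 to bidder 1.
optimum = 199.0
```

Greedy exhausts bidder 1's budget on keyword 1 and then has almost nothing to sell keyword 2 for, whereas the scaled rule splits the early queries between the two bidders and keeps part of bidder 1's budget for later.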


Theorem 28.3. The competitive ratio of Algorithm 1 is $1 - 1/e$. 


We outline the main ideas in the proof of the theorem. Let $k$ be a sufficiently large 
number used for discretizing the budgets of the bidders. We say that an advertiser 
is of type $j$ if she has spent within $\bigl(\frac{j-1}{k}, \frac{j}{k}\bigr]$ fraction of her budget so far. Let $s_j$ be 
the total budget of type $j$ bidders. For $i = 0, 1, \ldots, k$, define $w_i$ to be the amount of 
money spent by all the bidders from the interval $\bigl(\frac{i-1}{k}, \frac{i}{k}\bigr]$ of their budgets. Also define 
the discrete version of function $\phi$, 

\[ \Phi(s) = 1 - \Bigl(1 - \frac{1}{k}\Bigr)^{k-s}. \tag{28.12} \]

It is easy to see that when $k$ tends to infinity $\Phi(s) \to \phi(s/k)$. Let OPT be the solution of 
the optimal off-line algorithm (i.e., the solution of the optimization program (28.11)). 
For simplicity, assume that the optimal algorithm spends all of the budget of the bidders. 
We have the following lemma. 


Lemma 28.4 At the end of the algorithm, this inequality holds: 

\[ \sum_{i=0}^{k} \Phi(i) s_i \le \sum_{i=0}^{k} \Phi(i) w_i \tag{28.13} \]


PROOF Consider the time that query $q$ arrives. Suppose that OPT allocates $q$ 
to a bidder of current type $t$, whose type at the end of the algorithm will be $t'$. 
Let $b_{\mathrm{opt}}$ and $b_{\mathrm{alg}}$ be the amount of money that OPT and the algorithm get from 
bidders for $q$. Let $i$ be the type of the bidder to whom the algorithm allocates the query. 
We have 

\[ \Phi(t') \, b_{\mathrm{opt}} \le \Phi(t) \, b_{\mathrm{opt}} \le \Phi(i) \, b_{\mathrm{alg}}. \tag{28.14} \]

Now summing the inequality above over all the queries, the left-hand side of 
(28.14) contributes to the sum $\sum_i \Phi(i) s_i$, and the right-hand side contributes to 
$\sum_i \Phi(i) w_i$. So the lemma follows. 


Now we are ready to prove Theorem 28.3. 


PROOF By definition $w_i \le \frac{1}{k} \sum_{j=i}^{k} s_j$. Using Lemma 28.4, 

\[ \sum_{i=0}^{k} \Phi(i) s_i \le \frac{1}{k} \sum_{i=0}^{k} \Phi(i) \sum_{j=i}^{k} s_j. \]

Changing the order of the sums and computing the sum of the geometric series, 
we have 

\[ \sum_{i=0}^{k} \Phi(i) s_i \le \frac{1}{k} \sum_{i=0}^{k} \Phi(i) \sum_{j=i}^{k} s_j = \frac{1}{k} \sum_{i=0}^{k} \Bigl( \sum_{j=0}^{i} \Phi(j) \Bigr) s_i = \sum_{i=0}^{k} \Bigl( \frac{i}{k} + \Phi(i) - \Phi(0) + O\bigl(\tfrac{1}{k}\bigr) \Bigr) s_i, \]

which yields 

\[ \Bigl( \Phi(0) - O\bigl(\tfrac{1}{k}\bigr) \Bigr) \sum_{i=0}^{k} s_i \le \sum_{i=0}^{k} \frac{i}{k} s_i. \]

Note that as $k$ goes to infinity the left-hand side tends to $(1 - \frac{1}{e})\,$OPT. 
The right-hand side is equal to the revenue of the algorithm. So the theorem 
follows. 


The same algorithm can be applied when multiple advertisements can appear with the 
result of a query or when advertisers enter at different times. At present, the equilibrium 
properties of this allocation rule are unknown. 


28.5 Open Questions 


We close this chapter with a brief review of important issues not directly addressed in 
this chapter. 

While our discussion has focused on existing mechanisms, one should not conclude 
that there is no room for improvement in their design. For example, there is debate over 
the role of the budget constraints in these auctions. In many cases they do not appear 
to be hard constraints as bidders frequently adjust them. A bidder can also “expand” 
their budget simply by lowering their bid and paying less per click. Some argue that 
the budget constraint is merely a convenient way to express other desires, for example, 
limiting one’s exposure or spreading one’s advertising over a longer period. All of 
this suggests the need for richer bidding models. Models that allow bidders to 
express decreasing marginal value for clicks, or distinct values for traffic from certain 
geographic regions, demographic profiles, etc., would support greater allocative efficiency, 
though they would pose a significant burden in terms of computational and elicitation costs. 

When advertiser payments are based on user clicks, search engines must invest in 
the task of detecting and ignoring robot clicks, spam clicks as well as clicks from an 
advertiser trying to impose costs on their competitor or from an affiliate who actually 
benefits monetarily from additional clicks. For this reason there is interest in exploring 
alternate pricing conventions. The most compelling is pay per action or conversion. The 
advertiser pays only if a click results in a sale, for example. This raises new incentive 
issues associated with tracking sales. 

The models in this chapter, as do most analyses in the literature, assume a monopoly 
search engine with a static user base. This would be an appropriate model if switching 
costs for advertisers and users were high. In fact, switching costs for many advertisers 
are low; many advertisers work with both Google and Yahoo! simultaneously, or work 
with third-party search engine marketers to manage their account across multiple 
search engines. Switching costs for users are essentially zero: to patronize a different 
search engine, users need merely type a new address into their web browser.⁴ The 
competitive pressures to retain advertisers able to switch advertisement networks or use 
multiple networks may cause firms to focus less on extracting the maximum revenue 
from advertisers possible and more on attracting and retaining advertisers. Similarly, 
search engines must make trade-off decisions between maximizing current period 
revenue and attracting and retaining users in the long term. For this reason it would be very 
instructive to understand the properties of keyword auctions in competition with each 
other. 

⁴ Personalization features may begin to introduce moderate switching costs for users. For now, reputation and 
branding seem to play a major role in search engine loyalty: blind relevance tests show little or no difference in 
quality among major search engines. 

The major search engines syndicate their advertisements to affiliate search engines 
and content providers. For example, Google, through its AdSense program, syndicates 
advertisements to AOL, MySpace, and thousands of other Web sites. The introduction 
of affiliates greatly complicates the semantics of bidding and allocation. 

We have assumed that CTRs are given. In practice, CTRs are learned over time 
and can depend on a variety of factors such as bidder identity; advertisement identity 
and content; user characteristics, including demographics, location, and history; and/or 
page context including other advertisements and algorithmic results. Learning CTRs 
poses an explore/exploit trade-off: the auctioneer can exploit known high-CTR ad- 
vertisements, or explore new advertisements or infrequently shown advertisements to 
uncover even higher-CTR advertisements. The auctioneer’s CTR estimate may differ 
from the bidder’s estimate; in particular, the auctioneer usually has more contextual 
information to learn from. 

In this chapter, we have focused on the auctioneer’s mechanism design problem. 
The advertiser’s bidding optimization problem is also challenging and the focus of a 
great deal of commercial and research activity. 


28.6 Bibliographic Notes 


The growth of paid placement has attracted recent research on this topic. Hoffman 
and Novak (2000) discuss the trend in Internet advertising toward per-click pricing 
rather than the traditional per-impression model. A good discussion of the practice of 
sponsored search is available on the Web at http://searchenginewatch.com/webmasters/paid.html.

Computing the explicit form of incentive compatible payments for ranking auctions 
is carried out in Aggarwal et al. (2006) and Iyengar and Kumar (2006). The Bayesian 
equilibrium of the GFP is derived in Lahaie (2006). The details of the revenue-maximizing
auction for (static) slot auctions are derived in Feng (2005) and Iyengar and
Kumar (2006). The envy-free analysis of the static model is due to Edelman et al. (in
press). A similar analysis can be found in Varian (in press). The latter paper shows 
how upper and lower bounds on bidders’ actual values can be derived given their bids. 
Feng et al. (2006) explore four ranking algorithms via simulation. All of these results 
would apply to condominium auctions as well; see Burguet (2005) for a discussion of 
condominium auctions. 

The Northwest corner rule for the assignment problem dates back to Monge (1781).
Ascending implementations of the Vickrey auction for the static model can be found 
in Crawford and Knoer (1981) and Demange, Gale, and Sotomayor (1986) (which 
is a variant of the Hungarian algorithm for solving the assignment problem). The
auction of Demange, Gale, and Sotomayor was dubbed, in Edelman et al. (in press),
the generalized English auction. 

The online allocation problem studied in Section 28.4.1 is proposed and analyzed by 
Mehta et al. (2005). This problem is a generalization of the online bipartite matching 
problem studied by Karp et al. (1990) and Kalyanasundaram and Pruhs (2000). More 
recently Buchbinder et al. (2006) gave a primal-dual algorithm and analysis for the 
problem given in Mehta et al. They also extended that framework to scenarios in which 
additional information is available, yielding improved worst-case competitive factors. 

Mahdian et al. (2006) study the online allocation problem when the search engine 
has a somewhat reliable estimate of the number of users searching for a keyword 
every day. Mahdian and Saberi (2006) study multiunit auctions for perishable goods, in
a setting where the supply arrives online. They motivate their model by its application 
to sponsored search. Abrams (2006) and Borgs et al. (2005) design multiunit auctions 
for budget-constrained bidders, which can be interpreted as slot auctions, with a 
focus on revenue optimization and truthfulness. For a discussion of vindictive bidding 
and some of the dynamic aspects of slot auctions see Asdemir (2006) and Zhou and 
Lukose (2006). 

Weber and Zheng (2006) study the implementation of paid placement strategies, and 
find that the revenue-maximizing search engine design bases rankings on a weighted 
average of relative quality performance and bid amount. Hu (2003) uses contract 
theory to show that performance-based pricing models can give the publisher proper 
incentives to improve the effectiveness of advertising campaigns. Rolland and Patterson 
(2003) propose a methodology using expert systems to improve the matching between
advertisers and Web users. 

Besides the optimal ranking mechanism, the search engine must also choose the 
number of paid slots by finding the optimal trade-off between sponsorship and user 
retention. Bhargava and Feng (2002) provide a theoretical model to explain and analyze 
this trade-off. 

The problem of learning CTRs is nontrivial and presents an explore/exploit trade- 
off. Pandey and Olston (2006) formulate the problem as an appropriate multiarmed 
bandit optimization; Gonen and Pavlov (2007) derive a bandit optimization algorithm 
that retains incentive compatibility for bidders. 

Several authors explore the advertiser’s bidding optimization problem (Borgs et al., 
2005; Cary et al., 2007; Kitts et al., 2005; Kitts and LeBlanc, 2004; Rusmevichientong 
and Williamson, 2006). Kitts et al. (2005) provide evidence that the first slot does not 
have an appreciably lower conversion rate than the second slot as some advertisers 
believe.


Exercises


28.1 Consider the model of keyword auctions where the CTR of agent i in slot j is μ_j.
Is every full-information equilibrium of the GSP locally envy-free?


28.2 Consider the model of keyword auctions where the CTR of agent i in slot j is
μ_j β_i; i.e., the CTR is separable into a bidder effect β_i and a position effect μ_j.
Suppose also that μ_1 > μ_2 > ··· > μ_m. Give a simple algorithm for determining
the efficient allocation of bidders to slots. Derive the payment rule implied by the
VCG mechanism for this environment.


28.3 In the model of the previous exercise, suppose also that the auctioneer assigns a
weight w_i = w_i(β_i) to each bidder; weights may depend on the bidder effects, but
not on their bids. Suppose bidders are assigned to slots by decreasing order of their
scores w_i b_i. Use formula (28.8) to derive the payment rule that combined with the
allocation rule just described would yield an incentive compatible mechanism.


28.4 Consider the model of keyword auctions where the CTR of agent i in slot j is
μ_j β_i; i.e., the CTR is separable into a bidder effect β_i and a position effect μ_j. The
auctioneer sets weights w_i = β_i, and a bidder pays the lowest amount necessary
to retain his position.


(a) Give the inequalities that characterize a full-information (Nash) equilibrium
in this model. Strengthen them to give the inequalities for a locally envy-free
equilibrium.

(b) Show that in a locally envy-free equilibrium, bidders are ranked in order of
decreasing β_i v_i.

(c) From among the set of locally envy-free equilibria, exhibit the one that yields 
the smallest possible revenue to the auctioneer.


28.5 Consider the model of keyword auctions where the CTR of agent i in slot j is
μ_j. Give an example of where the GFP auction does not admit a pure strategy
full-information equilibrium. For simplicity, you may assume a discretized set of
allowable bids.


28.6 Consider the online allocation problem discussed in Section 28.4. Show that the
competitive ratio of the algorithm remains the same even if the optimum solution
does not exhaust all the budgets.


CHAPTER 29 


Computational Evolutionary 
Game Theory 


Siddharth Suri 


Abstract 


This chapter examines the intersection of evolutionary game theory and theoretical computer science. 
We will show how techniques from each field can be used to answer fundamental questions in the 
other. In addition, we will analyze a model that arises by combining ideas from both fields. First, we 
describe the classical model of evolutionary game theory and analyze the computational complexity 
of its central equilibrium concept. Doing so involves applying techniques from complexity theory to 
the problem of finding a game-theoretic equilibrium. Second, we show how agents using imitative 
dynamics, often considered in evolutionary game-theory, converge to an equilibrium in a routing 
game. This is an instance of an evolutionary game-theoretic concept providing an algorithm for 
finding an equilibrium. Third, we generalize the classical model of evolutionary game theory to a 
graph-theoretic setting. Finally, this chapter concludes with directions for future research. Taken as 
a whole, this chapter describes how the fields of theoretical computer science and evolutionary game 
theory can inform each other. 


29.1 Evolutionary Game Theory 


Classical evolutionary game theory models organisms in a population interacting and 
competing for resources. The classical model assumes that the population is infinite. It 
models interaction by choosing two organisms uniformly at random, who then play a 
2-player, symmetric game. The payoffs that these organisms earn represent an increase 
or a loss in fitness, which either helps or hinders the organism's ability to reproduce.
In this model, when an organism reproduces, it does so by making an exact replica of
itself; thus a child will adopt the same strategy as its parent.

One of the fundamental goals of evolutionary game theory is to characterize which 
strategies are resilient to small mutant invasions. In the classical model of evolutionary 
game theory, a large fraction of the population, called the incumbents, all adopt the 
same strategy. The rest of the population, called the mutants, all adopt some other 
strategy. The incumbent strategy is considered to be stable if the incumbents retain 
a higher fitness than the mutants. Since the incumbents are more fit, they reproduce
more frequently and the fraction of mutants in the population will eventually go to
0. Put another way, an evolutionarily stable strategy (ESS) is a strategy such that if
all the members of a population adopt it, then no mutant strategy could overrun the 
population. We shall see in Section 29.1.1 that ESS are a refinement of Nash equilibria. 

Replication is not the only type of dynamic studied in evolutionary game theory. 
Imitation is another widely studied dynamic. In imitative dynamics, each agent initially 
plays some pure strategy. As time goes on, agents interact pairwise. After this pairwise 
interaction, if one agent sees that the other agent earned a higher payoff, the agent with
the lower payoff may adopt, or imitate, the strategy of the agent who earned the higher
payoff. Imitative dynamics model, for example, a new idea, innovation, or fad spreading
through a population of individuals or firms. 

In general, there are two main characteristics common to most evolutionary game 
theoretic models. The first is that the population is infinite. The second is that players 
adopt a very simple, local dynamic, such as replication or imitation, for choosing 
and updating their strategies. These dynamics result in the agents learning from the 
other agents in their environment; they provide a method for an equilibrium strategy 
to emerge from the population. These types of dynamics explain how a population can 
converge to an equilibrium. For example, Section 18.3.1 shows that equilibria for the
nonatomic selfish routing game exist, whereas Section 29.3 will show how agents
obeying imitative dynamics can converge to it. 

Next we will formally describe the basic model of evolutionary game theory. Then, 
in Section 29.2, we will analyze the computational complexity of finding and recog- 
nizing stable strategies. After that, in Section 29.3, we will see an example of imitative 
dynamics. We will apply imitative dynamics to the problem of selfish routing and show 
how agents converge to an equilibrium. Finally, in Section 29.4, we will examine the no- 
tion of stable strategies in a context where agents play against their local neighborhood 
in a graph, as opposed to playing against another agent chosen uniformly at random. 


29.1.1 The Classical Model of Evolutionary Game Theory 


The classical model of evolutionary game theory considers an infinite population of 
organisms, where each organism is assumed to be equally likely to interact with each 
other organism. Interaction is modeled as playing a fixed, 2-player, symmetric game 
defined by a fitness function F (we emphasize that the same game F is played in 
all interactions). Let A denote the set of actions available to both players, and let
Δ(A) denote the set of probability distributions or mixed strategies over A; then
F : Δ(A) × Δ(A) → ℝ. If two organisms interact, one playing a mixed strategy s and
the other playing a mixed strategy t, the s-player earns a fitness of F(s|t) while the 
t-player earns a fitness of F(t|s). 

In this infinite population of organisms, suppose that there is a 1 − ε fraction who
play strategy s, and call these organisms incumbents, and suppose that there is an ε
fraction who play t, and call these organisms mutants. Assume that two organisms are
chosen uniformly at random to play each other. The strategy s is an ESS if the expected
fitness of an organism playing s is higher than that of an organism playing t, for all
t ≠ s and all sufficiently small ε. Since an incumbent will meet another incumbent
with probability 1 − ε and it will meet a mutant with probability ε, we can calculate the
expected fitness of an incumbent, which is simply (1 − ε)F(s|s) + εF(s|t). Similarly,
the expected fitness of a mutant is (1 − ε)F(t|s) + εF(t|t). Thus we come to the formal
definition of an ESS.


Definition 29.1 A strategy s is an evolutionarily stable strategy (ESS) for
the 2-player, symmetric game given by fitness function F, if for every strategy
t ≠ s, there exists an ε_t such that for all 0 < ε < ε_t, (1 − ε)F(s|s) + εF(s|t) >
(1 − ε)F(t|s) + εF(t|t).
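
The inequality in Definition 29.1 can be evaluated directly for a finite game. The following is a minimal sketch, assuming the fitness function is given as a payoff matrix `F` (a list of rows) and mixed strategies as probability lists; the function names and the 2×2 coordination game are illustrative choices, not part of the chapter. It checks one mutant and one value of ε at a time, so it illustrates the definition rather than deciding whether a strategy is an ESS.

```python
def fitness(F, s, t):
    """F(s|t): expected payoff of mixed strategy s against mixed strategy t."""
    return sum(s[i] * F[i][j] * t[j]
               for i in range(len(s)) for j in range(len(t)))


def resists_invasion(F, s, t, eps):
    """Evaluate the inequality of Definition 29.1 for one mutant t and one eps."""
    incumbent = (1 - eps) * fitness(F, s, s) + eps * fitness(F, s, t)
    mutant = (1 - eps) * fitness(F, t, s) + eps * fitness(F, t, t)
    return incumbent > mutant


# Hypothetical coordination game: both players earn 2 on action 0, 1 on action 1.
F = [[2, 0],
     [0, 1]]
incumbent_strategy = [0, 1]   # everyone plays action 1
mutant_strategy = [1, 0]      # mutants play action 0

small = resists_invasion(F, incumbent_strategy, mutant_strategy, 0.1)
large = resists_invasion(F, incumbent_strategy, mutant_strategy, 0.5)
```

In this toy game the incumbents out-earn a 10% invasion but not a 50% one, which is why the definition only asks for the inequality to hold for sufficiently small ε.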


If one assumes that each organism reproduces asexually, and spawns a number 
of offspring proportional to its fitness, then stable strategies will be those where the 
incumbent population will reproduce more than any small mutant invasion. Thus the 
mutant invasion will have fewer offspring and, in the long run, the fraction of mutants 
in the population will tend to 0. In fact, a continuous time analysis of the replicator 
dynamics shows that every ESS is asymptotically stable. 

Definition 29.1 holds if and only if either of two conditions on s is satisfied for all t ≠ s:
(1) F(s|s) > F(t|s), or (2) F(s|s) = F(t|s) and F(s|t) > F(t|t). A consequence of
this alternate formulation of an ESS is that for s to be an ESS, it must be the case
that F(s|s) ≥ F(t|s), for all strategies t. This inequality means that s must be a best
response to itself, and thus for any ESS s, the strategy profile (s, s) must also be a Nash 
equilibrium. This results in another, equivalent way to define an ESS. 


Theorem 29.2 A strategy s is an ESS for a 2-player, symmetric game given by 
fitness function F, if and only if (s, s) is a Nash equilibrium of F, and for every
best response t to s, where t ≠ s, F(s|t) > F(t|t).


In general the notion of ESS is more restrictive than Nash equilibrium, and not all 
2-player, symmetric games have an ESS. 

Next, we give an example of a 2-player, symmetric game called Hawks and Doves, 
and then solve for its ESS. The game of Hawks and Doves models two organisms 
fighting over a resource. Obtaining the resource results in a gain of fitness of V, while 
fighting for the resource and losing results in a fitness decrease of C. If a Hawk plays 
a Dove, the Hawk will fight for the resource and the Dove will give up. This results in 
a Hawk earning an increase of fitness of V, and the Dove’s fitness staying the same. If
two Doves play each other, they split the resource earning them both a fitness increase
of V/2. If two Hawks play, eventually one will win and one will lose, and it is assumed
that each organism has a 1/2 chance of being the winner. Figure 29.1 shows the payoff 
matrix for this game. 

           H          D
H     (V − C)/2       V
D         0          V/2


Figure 29.1. The game of Hawks and Doves.


The strategy profile (D, D) is not a Nash Equilibrium because one player could
unilaterally deviate and play H and increase its payoff from V/2 to V. Since (D, D) is
not a Nash Equilibrium, D cannot be an ESS. Now, if V > C then H is an ESS. To see
this observe that F(H|H) = (V − C)/2. Let t be any mixed strategy with probability
p < 1 of playing H and 1 − p of playing D, then F(t|H) = p(V − C)/2 + (1 − p)·0 <
(V − C)/2. Since F(H|H) > F(t|H) for all t ≠ H, H is an ESS. We leave it as an
exercise for the reader (see Section 29.6) to show that if V < C, the mixed strategy of
playing H with probability V/C and D with probability 1 − V/C is an ESS. Observe
that as V → C, the probability of playing H approaches 1. This coincides with the
pure strategy ESS of playing H when V > C.
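
The claims about Hawks and Doves can be spot-checked numerically with the characterization in Theorem 29.2. The sketch below is an illustration under stated assumptions: it uses exact rational arithmetic, tests mutants t = (p, 1 − p) only on a finite grid of p values, and the helper names and the sample values of V and C are hypothetical. Passing the grid test is evidence, not a proof, that a strategy is an ESS.

```python
from fractions import Fraction


def hawk_dove(V, C):
    """Payoff matrix of Figure 29.1; index 0 is H, index 1 is D."""
    V, C = Fraction(V), Fraction(C)
    return [[(V - C) / 2, V],
            [Fraction(0), V / 2]]


def payoff(F, s, t):
    """F(s|t) for mixed strategies given as (Pr[H], Pr[D])."""
    return sum(s[i] * F[i][j] * t[j] for i in range(2) for j in range(2))


def passes_ess_test(F, s, grid=100):
    """Apply the two conditions of Theorem 29.2 to mutants on a grid."""
    for n in range(grid + 1):
        p = Fraction(n, grid)
        t = (p, 1 - p)
        if t == s:
            continue
        if payoff(F, t, s) > payoff(F, s, s):
            return False  # (s, s) is not a Nash equilibrium
        if (payoff(F, t, s) == payoff(F, s, s)
                and payoff(F, s, t) <= payoff(F, t, t)):
            return False  # t is a best response that s does not beat
    return True


H = (Fraction(1), Fraction(0))
D = (Fraction(0), Fraction(1))

# V > C: the pure strategy H passes, D does not.
hawk_ok = passes_ess_test(hawk_dove(4, 2), H)
dove_ok = passes_ess_test(hawk_dove(4, 2), D)

# V < C: the mixed strategy playing H with probability V/C passes, H does not.
mixed = (Fraction(2, 4), 1 - Fraction(2, 4))
mixed_ok = passes_ess_test(hawk_dove(2, 4), mixed)
hawk_when_costly = passes_ess_test(hawk_dove(2, 4), H)
```

Exact fractions are used so that the equality test for best responses is not affected by floating-point rounding.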


29.2 The Computational Complexity 
of Evolutionarily Stable Strategies 


Next we show that the problem of finding an ESS of a given 2-player
symmetric game is both NP-hard and coNP-hard. To prove this, we will make a
reduction from the problem of checking if a graph has a maximum clique of size 
exactly k. Prior work has shown that this problem is both NP-hard and coNP-hard. 
Along the way to proving the hardness of finding an ESS, we will see that the problem 
of recognizing whether a given strategy is an ESS is also coNP-hard. 

Next we will give the intuition behind the reduction. The reduction will transform 
a graph G into a payoff matrix F which will have an ESS if and only if the size of the 
largest clique in G is not equal to k. The reduction transforms the adjacency matrix 
of G into the payoff matrix F by replacing all the diagonal entries with the value 1/2, 
inserting a 0th row with each entry having a constant value, and inserting a 0th column
with each entry having the same constant value. 

Informally speaking, for a mixed strategy s to be an ESS, incumbents should receive 
a relatively high payoff when playing other incumbents. In order for a strategy s to 
have this property for the game F, when s plays itself it must guarantee that the pure
strategies chosen will correspond to two adjacent vertices. One can see that having a 
mixed strategy with support over a clique will achieve this. We will show in Lemma 29.3 
that having support over a clique will result in a higher payoff than having support over 
a dense subgraph that is not a clique. Having the diagonal entries consist of the constant 
1/2 will help us prove this. This lemma will allow us to prove that when the size of 
the maximum clique is greater than k, the uniform mixed strategy corresponding to 
vertices of the clique will be an ESS. In addition, setting the 0th row and column of
F to a carefully chosen constant will give us a pure strategy ESS in the case where 
the size of the maximum clique is less than k. This constant will also allow us to 
show that there is no ESS in the case where the size of the maximum clique in G is 
exactly k. 

In describing this reduction, and for the rest of this chapter, we use the notation 
F(s|t) to denote the payoff of the player playing strategy s when confronted with a 
player playing strategy t. When we are referring to a specific entry in the payoff matrix 
of F, we will use the notation F(i, j) to denote the entry in the ith row and jth column. 
Also, if s is a mixed strategy, we let s_i denote the probability that the pure strategy i
is played. (Thus we will use s and t to denote mixed strategies, and i and j to denote
indices into these mixed strategies, as well as indices into the payoff matrix F.)

The reduction from a graph G = (V, E) to a payoff matrix F that we consider works
as follows.


• for 1 ≤ i ≠ j ≤ n: F(i, j) = 1 if (i, j) ∈ E and F(i, j) = 0 if (i, j) ∉ E
• for 1 ≤ i ≤ n: F(i, i) = 1/2
• for 0 ≤ i ≤ n: F(0, i) = F(i, 0) = 1 − 1/(2k)
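
The three rules of the reduction translate directly into code. The following is a minimal sketch, assuming the graph is given as a vertex count n (vertices numbered 1 to n) and a list of undirected edges; the function name `reduction_matrix` and the sample graph are illustrative and not taken from the chapter.

```python
from fractions import Fraction


def reduction_matrix(n, edges, k):
    """Build the (n+1) x (n+1) payoff matrix F for a graph on vertices 1..n."""
    c = 1 - Fraction(1, 2 * k)
    # Start from all zeros, which covers the non-edges.
    F = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    # Adjacent vertices get payoff 1 (the adjacency matrix is symmetric).
    for i, j in edges:
        F[i][j] = F[j][i] = Fraction(1)
    # Diagonal entries for vertices are 1/2.
    for i in range(1, n + 1):
        F[i][i] = Fraction(1, 2)
    # The 0th row and 0th column hold the constant 1 - 1/(2k).
    for i in range(n + 1):
        F[0][i] = F[i][0] = c
    return F


# Hypothetical example: a triangle on {1, 2, 3} with a pendant vertex 4, k = 2.
F = reduction_matrix(4, [(1, 2), (2, 3), (1, 3), (3, 4)], 2)
```

The matrix has size (n + 1) × (n + 1), so the reduction runs in time polynomial in the size of the graph.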


To show that F has an ESS if and only if the size of the largest clique in G is not equal
to k, we will need the following technical lemma.


Lemma 29.3 If s is a strategy with s_0 = 0, then F(s|s) ≤ 1 − 1/(2k′), where
k′ is the size of the maximum clique in G. This holds with equality if and only if s
is the uniform distribution over a k′-clique.


PROOF The proof is by induction on the number of nonedges between the
vertices in G = (V, E) corresponding to elements of the support set of s. The base
case is when there are 0 such nonedges, which means the vertices corresponding
to the support set of s form a k″-clique, where k″ ≤ k′. We assume, without loss
of generality, that the vertices in the k″-clique are numbered 1, 2, ..., k″.

\[
\begin{aligned}
F(s|s) &= \sum_{i \in [k'']} \sum_{j \in [k'']} s_i s_j F(i, j) \\
       &= \sum_{i \in [k'']} \sum_{j \in [k'']} s_i s_j - \frac{1}{2} \sum_{i \in [k'']} s_i^2 \\
       &= \sum_{i \in [k'']} s_i \sum_{j \in [k'']} s_j - \frac{1}{2} \sum_{i \in [k'']} s_i^2 \\
       &\le 1 - 1/(2k'')
\end{aligned}
\]

The last inequality comes from the fact that when ||s||_1 = 1, ||s||_2 is minimized,
and the inequality is tight, only when all of the components of s are equal.
Since k″ ≤ k′, the bound 1 − 1/(2k″) is at most 1 − 1/(2k′).
Conversely, if s is the uniform distribution over a k′-clique then the inequality is
tight, which is shown as follows,

\[
\begin{aligned}
F(s|s) &= \sum_{i \in [k']} \sum_{j \in [k']} s_i s_j F(i, j)
        = \frac{1}{k'^2} \sum_{i \in [k']} \sum_{j \in [k']} F(i, j) \\
       &= \frac{1}{k'^2} \left[ k'^2 - k'/2 \right] \\
       &= 1 - 1/(2k').
\end{aligned}
\]


For the inductive step, let u and v be two vertices such that (u, v) ∉ E. We
construct a new strategy s′ by moving the probability from v to u. So let s′_u = s_u +
s_v and s′_v = 0, and let the rest of the values of s′ be identical to those of s. Since v is
no longer in the support set of s′, we can use the induction hypothesis to conclude
that F(s′|s′) ≤ 1 − 1/(2k′). Let p = Σ_{w:(u,w)∈E} s_w and let q = Σ_{w:(v,w)∈E} s_w, and
without loss of generality assume that p ≥ q. By writing out the expressions
for F(s′|s′) and F(s|s) one can show F(s′|s′) = F(s|s) + 2s_v(p − q) + s_u s_v >
F(s|s). Thus, F(s|s) < 1 − 1/(2k′), which proves the inductive step.
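
Lemma 29.3 and the identity used in the inductive step can be checked on a small instance. The sketch below is illustrative only: the graph (a triangle on vertices 1, 2, 3 plus a pendant vertex 4, so k′ = 3), the helper name `self_payoff`, and the chosen strategies are assumptions made for this example. Strategies are indexed from 0 with s_0 = 0, matching the lemma.

```python
from fractions import Fraction


def self_payoff(n, edges, s):
    """F(s|s) for a strategy s over vertices 1..n with s[0] == 0."""
    adjacent = {(i, j) for i, j in edges} | {(j, i) for i, j in edges}
    total = Fraction(0)
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if i == j:
                entry = Fraction(1, 2)
            else:
                entry = Fraction(1 if (i, j) in adjacent else 0)
            total += s[i] * entry * s[j]
    return total


n, edges = 4, [(1, 2), (2, 3), (1, 3), (3, 4)]
bound = 1 - Fraction(1, 2 * 3)  # 1 - 1/(2k') with k' = 3

third, quarter = Fraction(1, 3), Fraction(1, 4)
on_clique = [0, third, third, third, 0]            # uniform over the 3-clique
spread = [0, quarter, quarter, quarter, quarter]   # support includes nonedges

# Inductive step with u = 1, v = 4, a nonedge: move the mass of v onto u.
moved = [0, quarter + quarter, quarter, quarter, 0]
p = spread[2] + spread[3]   # mass on the neighbors of u
q = spread[3]               # mass on the neighbors of v
predicted = self_payoff(n, edges, spread) + 2 * spread[4] * (p - q) + spread[1] * spread[4]
```

On this instance the uniform distribution over the triangle attains the bound 5/6, the spread-out strategy falls below it, and moving probability from v to u increases the payoff by exactly 2s_v(p − q) + s_u s_v.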

Now we will use this lemma to prove the necessary properties of the reduction. The
next two lemmas, when taken together, show that if the maximum size clique in G has 
size not equal to k, then F has an ESS. 


Lemma 29.4 If C is a maximal clique in G of size k′ > k, and s is the uniform
distribution on C, then s is an ESS.


PROOF By Lemma 29.3, F(s|s) = 1 − 1/(2k′). By the construction of the payoff
matrix F, F(0|s) = 1 − 1/(2k) < F(s|s). Also, for any u ∉ C, u is connected
to at most k′ − 1 vertices in C, thus F(u|s) ≤ 1 − 1/k′ < F(s|s). Thus any best
response to s must have support only over C. Furthermore, by Lemma 29.3 the
payoff of s against s is maximized when s is the uniform distribution over C. Thus,
s is a best response to itself. To prove that s is an ESS, it remains to show that for
all t ≠ s that are best responses to s, F(s|t) > F(t|t). Again by Lemma 29.3,
F(t|t) < 1 − 1/(2k′). Since C is a clique and s and t are distributions with support
over C, using the structure of F one can compute that F(s|t) = 1 − 1/(2k′).
Thus, F(s|t) > F(t|t) and s is an ESS.


Lemma 29.5 If the maximum size clique in G is of size k′ < k then the pure
strategy 0 is an ESS.


PROOF For any mutant strategy t, F(t|0) = 1 − 1/(2k) = F(0|0), thus 0 is a
best response to itself. Next, we show that for any t not equal to the pure strategy
0, F(0|t) > F(t|t). To do so, we first show that we can assume that t places no
weight on the pure strategy 0. Let t* be the strategy t with the probability of
playing the pure strategy 0 set to the value 0 and then renormalized. So, t*_0 = 0
and for i ≠ 0, t*_i = t_i/(1 − t_0). By writing out the expressions for F(t|t) and
F(t*|t*), one can show F(t|t) = (2t_0 − t_0²)(1 − 1/(2k)) + (1 − 2t_0 + t_0²)F(t*|t*).
Since F(0|t) = 1 − 1/(2k), F(0|t) > F(t|t) if and only if F(0|t) > F(t*|t*).
Next, since the maximum size clique in G has size k′ < k, applying Lemma 29.3
gives F(t*|t*) ≤ 1 − 1/(2k′) < 1 − 1/(2k) = F(0|t).
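
The decomposition of F(t|t) used in this proof follows from splitting the double sum according to whether each index equals 0. A short derivation is sketched here, writing c = 1 − 1/(2k) for the constant in the 0th row and column (the shorthand c is introduced only for this derivation) and using t_i = (1 − t_0) t*_i for i ≥ 1:

```latex
\begin{aligned}
F(t|t) &= \sum_{i, j \ge 0} t_i t_j F(i, j) \\
       &= t_0^2\, c + 2 t_0\, c \sum_{i \ge 1} t_i + \sum_{i, j \ge 1} t_i t_j F(i, j) \\
       &= t_0^2\, c + 2 t_0 (1 - t_0)\, c + (1 - t_0)^2 F(t^*|t^*) \\
       &= (2 t_0 - t_0^2)\, c + (1 - 2 t_0 + t_0^2)\, F(t^*|t^*), \\[4pt]
c - F(t|t) &= (1 - t_0)^2 \left[ c - F(t^*|t^*) \right].
\end{aligned}
```

Since t is not the pure strategy 0, we have t_0 < 1, so the last line shows that F(0|t) = c exceeds F(t|t) exactly when it exceeds F(t*|t*).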


The next two lemmas, when combined, show that if the maximum size clique in G 
has size exactly k, then F has no ESS. 


Lemma 29.6 If the maximum size clique of G has size at least k, then the pure strategy
0 is not an ESS.


PROOF Since F(0|0) = F(t|0) = 1 − 1/(2k) for any strategy t, the pure strategy
0 is a best response to itself. But, if t is the uniform distribution on the maximum
clique of G, which has size k′ ≥ k, then by Lemma 29.3 F(t|t) = 1 − 1/(2k′) ≥
F(0|t). By Theorem 29.2, this means the pure strategy 0 cannot be an ESS.


Lemma 29.7 If the maximum size clique of G is at most k, then any strategy
for F that is not equal to the pure strategy 0 is not an ESS for F.


The proof of this lemma uses techniques similar to those used in Lemmas 29.5
and 29.6, so we leave it as an exercise for the reader (see Section 29.6).
Taking Lemmas 29.4, 29.5, 29.6, and 29.7 together, we get the following theorem.


Theorem 29.8 Given a 2-player, symmetric game F, computing whether or not
F has an ESS is both NP-hard and coNP-hard. 


Combining Lemmas 29.5 and 29.6 shows that it is coNP-hard to check whether a given 
strategy is an ESS or not. 


Theorem 29.9 Given a 2-player, symmetric game F and a strategy s, it is
coNP-hard to compute whether or not s is an ESS of F.


PROOF Lemmas 29.5 and 29.6 imply that G has maximum clique of size less 
than k if and only if the pure strategy 0 is an ESS of F. Since the problem 
of determining whether a graph has a maximum clique of size less than k is 
coNP-hard, the problem of recognizing an ESS is also coNP-hard. 


Theorems 29.8 and 29.9 imply that there exist games for which, in all likelihood, 
efficient algorithms for finding and recognizing ESS do not exist. These results are 
important because if finding an ESS for a given class of games is NP-hard, it is unlikely 
that a finite population obeying some simple dynamic will quickly converge to it. But, 
this observation does not mean that one should avoid using models based on ESS. It 
simply means that to ensure the plausibility of a finite population model, one should 
check whether it is computationally tractable to find the ESS of the games the model 
considers. Moreover, this result does not directly imply that an infinite population
cannot quickly converge to an equilibrium. In fact, the next section explores
the convergence time of an infinite population to an equilibrium. 


29.3 Evolutionary Dynamics Applied to Selfish Routing 


In this section we describe a method for applying evolutionary dynamics to the problem 
of selfish routing. The model will consider an infinite population of agents, each of 
which carries an infinitesimally small amount of flow in a network. The agents' actions
allow them to change the path that they traverse; however, agents will not be allowed 
to change their paths arbitrarily. The space of actions available to these agents will be 
governed by simple, imitative dynamics. We show how agents selfishly seeking out 
low latency paths, while obeying these imitative dynamics, converge to an approximate 
equilibrium. First, we will formally describe the model which is similar to the nonatomic 
selfish routing model shown in Section 18.2.1. Then, we will briefly outline a technique 
that shows, in the limit, these dynamics converge to an equilibrium. Finally, we will 
analyze the time of convergence to an approximate equilibrium. 


29.3.1 The Selfish Routing Model with Imitative Dynamics 


Let G = (V, E) be a network with latency functions ℓ_e : [0, 1] → ℝ defined over each
edge. We assume the latency functions are nonnegative, nondecreasing, and Lipschitz
continuous. We also assume that there is one unit of flow that is to be routed from a
source s to a sink t, and we let P denote the set of s-t paths in G. We also assume
that there are infinitely many agents, each of which carries an infinitesimally small
amount of flow. Let x_p denote the fraction of flow that is being routed over path p.
Thus the vector x, which is indexed by the paths in P, will describe the flow over
G at a given point in time. A flow x is feasible if it routes 1 unit of flow from s
to t. Let x_e = Σ_{p∋e} x_p be the total load of an edge. The total latency of an edge is
denoted ℓ_e(x_e) and the total latency of a path is the sum of the latencies of the edges
in the path, ℓ_p(x) = Σ_{e∈p} ℓ_e(x_e). Finally, the average latency of the entire network is
ℓ̄(x) = Σ_{p∈P} x_p ℓ_p(x).

Initially each agent is assumed to play an arbitrary pure strategy. Then at each 
point in time, each agent is randomly paired with another agent and they compare 
the latencies of their paths. If the latency of one agent’s path is less than the latency 
of the other agent’s path, the agent experiencing higher latency switches to the lower 
latency path with probability proportional to the difference in latencies. These imitative 
dynamics model a source node gathering statistics on how long it takes for its packets 
to reach the destination and changing the route accordingly. In Section 29.3.2 we will 
describe why these dynamics will continue until the agents reach a Nash flow (also 
called Wardrop equilibrium), which is a pure strategy Nash equilibrium for this routing 
game, that we define next. 




Definition 29.10 A feasible flow x is a Nash flow if and only if for all p, p′ ∈ P
with x_p > 0, ℓ_p(x) ≤ ℓ_{p′}(x).


This definition ensures that, at a Nash flow, all s-t paths carrying flow have the same latency (this is
precisely Definition 18.1 when restricted to the single commodity case). If we further restrict
the latency functions to be strictly increasing, then Nash flows are essentially ESS.
We omit the proof of this since this section focuses on the convergence of the imitative 
dynamics (we refer the interested reader to Section 29.6 for the appropriate references). 
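
Definition 29.10 is easy to test on small instances. The following is a minimal sketch under a simplifying assumption that is not part of the chapter: the network consists of parallel links from s to t, so each path is a single edge and its latency depends only on its own load. The function name, the tolerance parameter, and the two-link example (one constant-latency link, one link whose latency equals its load) are illustrative.

```python
def is_nash_flow(latencies, x, tol=1e-9):
    """Check Definition 29.10 on parallel links.

    latencies[p] maps the load on link p to its latency; x[p] is the
    fraction of flow on link p.
    """
    if abs(sum(x) - 1.0) > tol or any(xp < 0 for xp in x):
        return False  # not a feasible flow
    lat = [l(xp) for l, xp in zip(latencies, x)]
    best = min(lat)
    # Every used path must be no slower than any other path.
    return all(lp <= best + tol for lp, xp in zip(lat, x) if xp > 0)


# Hypothetical two-link network.
links = [lambda load: 1.0, lambda load: load]

all_on_second = is_nash_flow(links, [0.0, 1.0])
even_split = is_nash_flow(links, [0.5, 0.5])
```

With all flow on the second link both links have latency 1, so no agent can improve; with an even split, agents on the first link experience latency 1 while the second link offers 0.5, so the flow is not a Nash flow.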

To analyze the convergence of these dynamics to either a Nash flow or an approx- 
imate equilibrium, it is necessary to compute the rate of change of the amount of 
flow over each path. Throughout this section we will use the notation x’ to denote the 
derivative with respect to time of the variable x, that is, x’ = dx/dt. The following set 
of differential equations describes the rate of change of the flow over each path.


\[
\begin{aligned}
x'_p &= -\sum_{q \in P :\, \ell_q(x) < \ell_p(x)} x_p x_q \lambda(x) [\ell_p(x) - \ell_q(x)] \\
     &\quad + \sum_{q \in P :\, \ell_q(x) > \ell_p(x)} x_p x_q \lambda(x) [\ell_q(x) - \ell_p(x)] && (29.1) \\
     &= \sum_{q \in P} x_p x_q \lambda(x) [\ell_q(x) - \ell_p(x)] \\
     &= \lambda(x) x_p \Big[ \sum_{q \in P} x_q \ell_q(x) - \ell_p(x) \sum_{q \in P} x_q \Big] \\
     &= \lambda(x) x_p [\bar{\ell}(x) - \ell_p(x)] && (29.2)
\end{aligned}
\]

In this derivation, the function λ accounts for normalizing factors so that the probabilities
are bounded above by 1, and it accounts for the rate at which organisms are paired.
The first summation in Equation 29.1 represents the expected number of agents that
switch from path p to lower latency paths. The probability that an agent on path p is
paired with an agent on path q is equal to the fraction of agents using q, which is x_q.
Then the agent using p would switch to q with probability ℓ_p(x) − ℓ_q(x). Multiplying
this product by x_p gives the expected number of agents moving from p to a lower latency
path q. Similarly, the second summation of Equation 29.1 represents the number
of agents that switch to path p from a higher latency path. The rest of the derivation
results from straightforward algebraic manipulations.

Intuitively, Equation 29.2 says that paths with below average latency will have more 
agents switching to them than from them; paths with above average latency will have 
more agents switching from them than to them. In Section 29.3.3, where we bound 
the time it takes for the system to converge to an approximate equilibrium, we would 
like the rate of change of the population to be independent of the scale of the latency 
functions. Thus we will replace $\lambda(x)$ by $\bar{\ell}(x)^{-1}$ to give a relative rate of change.

While these equations resulted from imitative dynamics, the same equations can be 
derived from a type of replication dynamic. In the literature, these equations are often 
called the replicator dynamics. Now that we have defined the model and the dynamics, 
we will show that the population of agents using imitative dynamics will converge to 
an approximate equilibrium. 


29.3.2 Convergence to Nash Flow 


It has been shown that as time goes to infinity, any initial flow that has support over 
all paths in P will eventually converge to a Nash flow. In this section we give an 
overview of the technique used to prove this. It is not clear how these techniques 
could yield a bound on the time to convergence, so we do not go into specific details 
of the proof. Since this text is focused on algorithmic game theory, we shall instead 
give more attention to another result, shown in Section 29.3.3, that bounds the time of 
convergence to an approximate equilibrium. 

The main vehicle for proving that imitative dynamics converge to a Nash flow is 
Lyapunov’s direct method. This is a general framework for proving that a system of 
differential equations converges to a stable point, without necessarily knowing how 
to solve the system of differential equations. Intuitively, this method works by first 
defining a real-valued potential function $\Phi$ that measures the potential energy of the
system of differential equations. The direct method requires that $\Phi$ be defined around
a neighborhood of a stable point and vanish at the stable point itself. Then, if one can 
show that the dynamics of the system cause the potential function to decrease with 
respect to time (along with a few other technical properties of the potential function), 
Lyapunov’s theorems will imply that if the system reaches the neighborhood of the 
stable point, the system will converge to the stable point. One drawback to this method 
is that it provides no guidance for choosing such a potential function. 

The argument that applies this method to the system of differential equations
described in Equation 29.2 works as follows. First, define $\Phi$ over the current flow such
that it will measure the total amount of latency the agents are experiencing. We will
define just such a function in the next section. Then, show that the imitative dynamics
cause $\Phi$ to decrease over time, and that $\Phi$ will achieve its minimum value at a
Nash flow. Applying one of the theorems in the Lyapunov framework allows one to
conclude that if the dynamics ever reach a neighborhood of an equilibrium, they will
converge to it. Finally, one has to show this neighborhood of convergence contains any
initial, feasible flow with support over all paths in $P$. This comes from the fact that the
dynamics cause the potential of any nonequilibrium flow to decrease and thus move
toward an equilibrium. Thus, in this model of selfish routing with imitative dynamics,
the Lyapunov framework allows one to show that the system will not get stuck in any
local minimum and will converge to a global minimum from any initial state with support
over all paths in $P$.


29.3.3 Convergence to Approximate Equilibrium 


In this section we will give a bound on how long it takes for the population of agents 
using imitative dynamics to come to an approximate equilibrium. 

One might consider using Euclidean distance between the current flow and an
equilibrium flow as a measure of approximation. To see intuitively why this is not a
suitable metric, consider a network and a flow where an $\epsilon$ fraction of the agents uses a
path $p$, which has a latency that is slightly less than the current average latency. If it
were essential for an equilibrium to have a large fraction of the population using $p$, we
could take $\epsilon$ to be arbitrarily small, which, by Equation 29.2, means we could make
$x_p'$ arbitrarily small. Thus the imitative dynamics would cause the population to move
arbitrarily slowly to $p$, and therefore it would take arbitrarily long for the population
to approach, in Euclidean distance, a Nash flow. We therefore define an $\epsilon$-approximate
equilibrium next.


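Definition 29.11 Let $P_\epsilon$ be the paths that have latency at least $(1+\epsilon)\bar{\ell}$, that is,
$P_\epsilon = \{p \in P \mid \ell_p(x) \ge (1+\epsilon)\bar{\ell}\}$, and let $x_\epsilon = \sum_{p \in P_\epsilon} x_p$ be the fraction of agents
using these paths. A population $x$ is said to be at an $\epsilon$-approximate equilibrium if
and only if $x_\epsilon \le \epsilon$.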


This definition ensures that, at such an equilibrium, only a small fraction of agents
experience latency significantly worse than the average latency. In contrast, the definition of
a Nash flow requires that all agents experience the same latency (see Definition 29.10).
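
The definition can be checked mechanically for a given flow. The following sketch takes the path flows and the path latencies evaluated at the current flow, computes the fraction of agents on paths whose latency is at least (1 + ε) times the average, and compares it with ε. The function names are hypothetical, and the treatment of the boundary cases (latency exactly at the threshold, fraction exactly ε) is an assumption of this sketch.

```python
# Minimal sketch of the epsilon-approximate equilibrium test.
# x[p] is the fraction of agents on path p; latencies[p] is l_p(x) at the current flow.

def slow_fraction(x, latencies, eps):
    """Fraction of agents on paths with latency at least (1 + eps) * average latency."""
    avg = sum(xp * lp for xp, lp in zip(x, latencies))
    return sum(xp for xp, lp in zip(x, latencies) if lp >= (1.0 + eps) * avg)


def is_approx_equilibrium(x, latencies, eps):
    """True if at most an eps fraction of agents is on the slow paths (boundary included)."""
    return slow_fraction(x, latencies, eps) <= eps
```

For example, with flows (0.9, 0.1) and latencies (1, 2) the average latency is 1.1; the flow passes the test for ε = 0.2 but fails it for ε = 0.05, since a 0.1 fraction of agents sits on a path well above the average.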

To prove the convergence of these imitative dynamics to an approximate equilibrium, 
we will make use of the following potential function. This function is one way to 
measure the total amount of latency the agents experience. 


$$\Phi(x) = \ell^* + \sum_{e \in E} \int_0^{x_e} \ell_e(u)\, du \qquad (29.3)$$

The integral sums the latency each agent that traverses edge e would experience if the 
agents were inserted one at a time. Summing this over each edge gives the total latency 
that each agent would experience if they were entered into the network one at a time. 
The term $\ell^*$ denotes the minimum average latency of a feasible flow, $\ell^* = \min_x \bar{\ell}(x)$. We
add this term as a technicality that will help prove our bounds on the time of convergence
to approximate equilibrium. With the exception of the $\ell^*$ term, this is the same potential
function described in Equation 18.3.
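
To make the potential concrete, the sketch below evaluates it in closed form for affine edge latencies a·u + b on a hypothetical network of parallel links, and records its value along an Euler discretization of the imitative dynamics with relative rate 1/(average latency). The additive constant for the minimum average latency is passed in as a parameter (defaulting to 0) rather than computed, since it does not affect whether the potential decreases. All names and the example network are assumptions made for illustration.

```python
# Minimal sketch: the potential for affine latencies l_e(u) = a_e * u + b_e on
# parallel links, tracked along Euler steps of the imitative dynamics.

def potential(x, coeffs, l_star=0.0):
    """l_star + sum_e integral_0^{x_e} (a_e * u + b_e) du, in closed form."""
    return l_star + sum(0.5 * a * xe * xe + b * xe for xe, (a, b) in zip(x, coeffs))


def euler_step(x, coeffs, dt):
    """One Euler step of x_p' = x_p * (avg - l_p) / avg."""
    lats = [a * xe + b for xe, (a, b) in zip(x, coeffs)]
    avg = sum(xe * le for xe, le in zip(x, lats))
    return [xe + dt * xe * (avg - le) / avg for xe, le in zip(x, lats)]


def potential_trace(x0, coeffs, dt=0.01, steps=300):
    """Potential values at the start and after each of `steps` Euler steps."""
    x = list(x0)
    trace = [potential(x, coeffs)]
    for _ in range(steps):
        x = euler_step(x, coeffs, dt)
        trace.append(potential(x, coeffs))
    return trace
```

On the two-link example with latencies u and 2u, started from the flow (0.5, 0.5), the recorded values start at 0.375 and move down toward 1/3, the value of the potential at the Nash flow (2/3, 1/3).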


Theorem 29.12 The imitative dynamics converge to an $\epsilon$-approximate equilibrium
within time $O(\epsilon^{-3} \ln(\ell_{\max}/\ell^*))$.


This proof works by analyzing the rate of change of $\Phi$ under the imitative dynamics.
If the current flow is not at an $\epsilon$-approximate equilibrium, we can lower bound the
absolute rate of change of $\Phi$ in terms of $\bar{\ell}$. We then lower bound $\bar{\ell}$ in terms of $\Phi$,
resulting in a differential inequality. Solving it leads to an upper bound on the time it
takes for the system to reach an approximate equilibrium.


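PROOF We start by computing the derivative with respect to time of the potential
function $\Phi$.

$$\Phi' = \sum_{e \in E} x_e'\, \ell_e(x_e) = \sum_{e \in E} \sum_{p \ni e} x_p'\, \ell_e(x_e)$$

Next we substitute in the imitative dynamics, given by Equation 29.2. After
that we simplify the expression with the aim of using Jensen’s inequality.

$$
\begin{aligned}
\Phi' &= \sum_{e \in E} \sum_{p \ni e} \lambda(x)\, x_p \left[\bar{\ell}(x) - \ell_p(x)\right] \ell_e(x_e) \\
&= \lambda(x) \sum_{p \in P} \sum_{e \in p} x_p \left[\bar{\ell}(x) - \ell_p(x)\right] \ell_e(x_e) \\
&= \lambda(x) \sum_{p \in P} x_p \left[\bar{\ell}(x) - \ell_p(x)\right] \ell_p(x) \\
&= \lambda(x) \left(\bar{\ell}(x) \sum_{p \in P} x_p\, \ell_p(x) - \sum_{p \in P} x_p\, \ell_p(x)^2\right) \\
&= \lambda(x) \left(\bar{\ell}(x)^2 - \sum_{p \in P} x_p\, \ell_p(x)^2\right) \qquad (29.4)
\end{aligned}
$$

Jensen’s inequality shows that this expression is bounded above by 0.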
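
One way to see this bound directly is the variance identity below, a short worked step written with the average latency as $\bar{\ell} = \sum_{p \in P} x_p \ell_p$ and using $\sum_{p \in P} x_p = 1$ (the bar notation is an assumption of this sketch).

```latex
\begin{align*}
\sum_{p \in P} x_p \ell_p^2 - \bar{\ell}^2
  &= \sum_{p \in P} x_p \ell_p^2 - 2\bar{\ell} \sum_{p \in P} x_p \ell_p
     + \bar{\ell}^2 \sum_{p \in P} x_p \\
  &= \sum_{p \in P} x_p \left(\ell_p - \bar{\ell}\right)^2 \;\ge\; 0 .
\end{align*}
```

So the bracket in Equation 29.4 is never positive, and it vanishes exactly when every path carrying flow has the same latency.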

We would like to upper bound $\Phi'$. To do so, first observe that as long as $x$ is
not at an $\epsilon$-approximate equilibrium, by definition at least an $\epsilon$ fraction of the
population experiences latency at least $(1+\epsilon)\bar{\ell}(x)$. Jensen’s inequality also shows
that for a fixed value of $\bar{\ell}(x)$, the $\sum_{p \in P} x_p\, \ell_p(x)^2$ term is minimized when the less
expensive paths all have equal latency, which we denote $\ell'$. Thus, for the purposes
of upper bounding $\Phi'$, we assume $\bar{\ell} = \epsilon(1+\epsilon)\bar{\ell} + (1-\epsilon)\ell'$. Plugging this into
Equation 29.4 gives

$$\Phi' \le \lambda(x) \left(\bar{\ell}(x)^2 - \left(\epsilon \left((1+\epsilon)\bar{\ell}(x)\right)^2 + (1-\epsilon)(\ell')^2\right)\right).$$

Now we substitute in $\ell' = \bar{\ell}\, \frac{1 - \epsilon - \epsilon^2}{1 - \epsilon}$ and perform some arithmetic, giving

$$\Phi' \le -\lambda(x)\, \frac{\epsilon^3}{1-\epsilon}\, \bar{\ell}(x)^2 \le -\lambda(x)\, \frac{\epsilon^3}{2}\, \bar{\ell}(x)^2.$$
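
The arithmetic in this step can be spelled out as follows. This is a worked sketch under the assumption that the slow paths carry exactly an $\epsilon$ fraction of the flow at latency $(1+\epsilon)\bar{\ell}$ and the remaining paths share the common latency $\ell'$, with $0 < \epsilon < 1$.

```latex
\begin{align*}
\ell' &= \bar{\ell}\,\frac{1-\epsilon-\epsilon^2}{1-\epsilon}
  \quad\text{(solving } \bar{\ell} = \epsilon(1+\epsilon)\bar{\ell} + (1-\epsilon)\ell' \text{ for } \ell'\text{)}, \\[4pt]
\epsilon(1+\epsilon)^2 + \frac{(1-\epsilon-\epsilon^2)^2}{1-\epsilon}
  &= \frac{(\epsilon + \epsilon^2 - \epsilon^3 - \epsilon^4)
     + (1 - 2\epsilon - \epsilon^2 + 2\epsilon^3 + \epsilon^4)}{1-\epsilon}
   = \frac{1 - \epsilon + \epsilon^3}{1-\epsilon}
   = 1 + \frac{\epsilon^3}{1-\epsilon}, \\[4pt]
\bar{\ell}^2 - \left(\epsilon(1+\epsilon)^2\bar{\ell}^2 + (1-\epsilon)(\ell')^2\right)
  &= \bar{\ell}^2 - \bar{\ell}^2\left(1 + \frac{\epsilon^3}{1-\epsilon}\right)
   = -\frac{\epsilon^3}{1-\epsilon}\,\bar{\ell}^2 .
\end{align*}
```

Since $1/(1-\epsilon) \ge 1 \ge 1/2$ for $0 < \epsilon < 1$, the coefficient $\epsilon^3/(1-\epsilon)$ can be weakened to $\epsilon^3/2$.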


We also replace $\lambda(x)$ with $\bar{\ell}(x)^{-1}$ to measure the relative rate of change of $\Phi$ under
the imitative dynamics,

$$\Phi' \le -\frac{\epsilon^3}{2}\, \bar{\ell}(x). \qquad (29.5)$$


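We can bound $\bar{\ell}$ from below by $\Phi/2$ in the following way,

$$
\begin{aligned}
\bar{\ell} &= \sum_{p \in P} x_p\, \ell_p(x) = \sum_{p \in P} \sum_{e \in p} x_p\, \ell_e(x_e) \\
&= \sum_{e \in E} \sum_{p \ni e} x_p\, \ell_e(x_e) = \sum_{e \in E} x_e\, \ell_e(x_e) \\
&\ge \sum_{e \in E} \int_0^{x_e} \ell_e(u)\, du. \qquad (29.6)
\end{aligned}
$$

The inequality holds because of the assumed monotonicity of the latency functions.
Now by the definition of $\ell^*$, it is easy to see that $\bar{\ell} \ge \ell^*$. Combining this
fact with Equation 29.6, we get that $\bar{\ell} + \bar{\ell} \ge \ell^* + \sum_{e \in E} \int_0^{x_e} \ell_e(u)\, du = \Phi$. Thus
$\bar{\ell} \ge \Phi/2$. Substituting this into Inequality 29.5, we get the following differential
inequality,

$$\Phi' \le -\epsilon^3 \Phi / 4.$$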


It can be shown via standard methods that any solution to the above inequality
satisfies

$$\Phi(t) \le \Phi(0)\, e^{-\epsilon^3 t / 4}.$$

Here $\Phi(0)$ is given by the initial boundary conditions. Recall that this inequality
only holds as long as $x$ is not at an $\epsilon$-approximate equilibrium. Thus, $x$ must reach
an $\epsilon$-approximate equilibrium when $\Phi$ reaches its minimum, $\Phi^*$, at the latest. So
we find the smallest $t$ for which this bound gives $\Phi(t) \le \Phi^*$,

$$t = 4 \epsilon^{-3} \ln \frac{\Phi(0)}{\Phi^*}.$$

It is easy to see that $\Phi^* \ge \ell^*$ and $\Phi(0) \le 2\ell_{\max}$, which proves the theorem.
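
For a feel of the magnitudes involved, the sketch below evaluates the explicit bound that this argument yields, namely 4 ε^{-3} ln(2 ℓ_max / ℓ*), obtained by solving the exponential decay bound on the potential for the time at which it drops to its minimum. The constant 4 and the factor 2 are read off from the proof steps above; the function name and sample values are hypothetical.

```python
import math

# Minimal sketch: the explicit time bound 4 * eps^-3 * ln(2 * l_max / l_star)
# implied by the exponential decay of the potential at rate eps^3 / 4.

def convergence_time_bound(eps, l_max, l_star):
    """Upper bound on the time to reach an eps-approximate equilibrium."""
    if not (0.0 < eps < 1.0) or l_star <= 0.0 or l_max < l_star:
        raise ValueError("need 0 < eps < 1 and 0 < l_star <= l_max")
    return 4.0 * eps ** -3 * math.log(2.0 * l_max / l_star)
```

Halving ε multiplies the bound by eight, while the dependence on the ratio of maximum to minimum average latency is only logarithmic.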


29.4 Evolutionary Game Theory over Graphs 


Next, we will consider a model similar to the classical model of evolutionary game 
theory described in Section 29.1, but we will no longer assume that two organisms are 
chosen uniformly at random to interact. Instead, we assume that organisms interact only 
with those in their local neighborhood, as defined by an undirected graph or network.
As in the classical setting (which can be viewed as the special case of the complete
network or clique), we shall assume an infinite population, by which we mean we 
examine limiting behavior in a family of graphs of increasing size. 

Before giving formal definitions, some comments are in order on what to expect 
in moving from the classical to the graph-theoretic setting. In the classical (complete 
graph) setting, there exist many symmetries that may be broken in moving to the 
network setting, at both the group and individual level. Indeed, such asymmetries are 
the primary interest in examining a graph-theoretic generalization. 

For example, at the group level, in the standard ESS definition, one need not discuss
any particular set of mutants of population fraction $\epsilon$. Since all organisms are equally
likely to interact, the survival or fate of any specific mutant set is identical to that of any
other. In the network setting, this may not be true: some mutant sets may be better able
to survive than others due to the specific topologies of their interactions in the network.
For instance, foreshadowing some of our analysis, if $s$ is an ESS but $F(t|t)$ is much
larger than $F(s|s)$ and $F(s|t)$, a mutant set with a great deal of “internal” interaction
(i.e., edges between mutants) may be able to survive, whereas one without this may
suffer. At the level of individuals, in the classical setting, the assertion that one mutant
dies implies that all mutants die, again by symmetry. In the network setting, individual
fates may differ within a group all playing a common strategy. These observations imply
that in examining ESS on networks we face definitional choices that were obscured in
the classical model.

If $G$ is a graph representing the allowed pairwise interactions between organisms
(vertices), and $u$ is a vertex of $G$ playing strategy $s_u$, then the fitness of $u$ is given by

$$F(u) = \frac{\sum_{v \in \Gamma(u)} F(s_u | s_v)}{|\Gamma(u)|}.$$

Here $s_v$ is the strategy being played by the neighbor $v$, and $\Gamma(u) = \{v \in V : (u, v) \in E\}$.
One can view the fitness of $u$ as the average fitness $u$ would obtain if it played each of
its neighbors, or the expected fitness $u$ would obtain if it were assigned to play one of
its neighbors chosen uniformly at random.
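
The fitness of a vertex is straightforward to compute from an adjacency structure. The sketch below assumes the graph is given as a dictionary mapping each vertex to a list of its neighbors, the strategies as a dictionary from vertex to strategy label, and the game as a dictionary `payoff[(a, b)]` holding the payoff to a player of `a` against a player of `b`; these representations and names are choices made for this illustration.

```python
# Minimal sketch: fitness of a vertex as its average payoff against its neighbors.
# adjacency: dict vertex -> list of neighbors (undirected graph)
# strategy:  dict vertex -> strategy label
# payoff:    dict (a, b) -> payoff to a player of a against a player of b

def fitness(u, adjacency, strategy, payoff):
    """Average payoff of u over its neighborhood; undefined for isolated vertices."""
    neighbors = adjacency[u]
    if not neighbors:
        raise ValueError("fitness is undefined for a vertex with no neighbors")
    total = sum(payoff[(strategy[u], strategy[v])] for v in neighbors)
    return total / len(neighbors)
```

On the path 0 - 1 - 2 with vertex 1 playing a mutant strategy "t" and the endpoints playing "s", the middle vertex's fitness is the payoff of "t" against "s", and each endpoint's fitness is the payoff of "s" against "t".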

Classical evolutionary game theory examines an infinite, symmetric population. 
Graphs or networks are inherently finite objects, and we are specifically interested in 
their asymmetries, as discussed above. Thus all of our definitions shall revolve around 
an infinite family $G = \{G_n\}_{n=0}^{\infty}$ of finite graphs $G_n$ over $n$ vertices, but we shall examine
asymptotic (large n) properties of such families. 

We first give a definition for a family of mutant vertex sets in such an infinite graph 
family to contract. 


Definition 29.13 Let $G = \{G_n\}_{n=0}^{\infty}$ be an infinite family of graphs, where $G_n$
has $n$ vertices. Let $M = \{M_n\}_{n=0}^{\infty}$ be any family of subsets of vertices of the $G_n$
such that $|M_n| \ge \epsilon n$ for some constant $\epsilon > 0$. Suppose all the vertices of $M_n$ play
a common (mutant) strategy $t$, and suppose the remaining vertices in $G_n$ play
a common (incumbent) strategy $s$. We say that $M_n$ contracts if for sufficiently
large $n$, for all but $o(n)$ of the $j \in M_n$, $j$ has an incumbent neighbor $i$ such that
$F(j) < F(i)$.

A reasonable alternative would be to ask that the condition above holds for all
mutants rather than all but $o(n)$. Note also that we only require that a mutant have
one incumbent neighbor of higher fitness in order to die; one might consider requiring
more. In Section 29.6 we ask the reader to consider one of these stronger conditions
and demonstrate that our results can no longer hold.
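
On a single finite graph, the condition in this definition amounts to listing the mutants that have no incumbent neighbor of strictly higher fitness and checking that there are few of them. The sketch below does this listing; the data representation (dictionaries for adjacency, strategies, and payoffs) and all names are assumptions of this illustration, and the asymptotic "all but o(n)" part of the definition is left to the caller, who would compare the count against n over a growing family.

```python
# Minimal sketch: mutants that survive, i.e. have no incumbent neighbor with
# strictly higher fitness, on one finite graph.

def fitness(u, adjacency, strategy, payoff):
    """Average payoff of u against its neighbors (u must have a neighbor)."""
    neighbors = adjacency[u]
    return sum(payoff[(strategy[u], strategy[v])] for v in neighbors) / len(neighbors)


def surviving_mutants(adjacency, strategy, payoff, mutant="t"):
    """Return the mutants with no incumbent neighbor i such that F(j) < F(i)."""
    fit = {u: fitness(u, adjacency, strategy, payoff) for u in adjacency if adjacency[u]}
    survivors = []
    for j in adjacency:
        if strategy[j] != mutant:
            continue
        beaten = any(strategy[i] != mutant and fit[i] > fit[j] for i in adjacency[j])
        if not beaten:
            survivors.append(j)
    return survivors
```

An isolated mutant counts as a survivor here, since it has no incumbent neighbor at all; this matches the remark that sparse, disconnected graphs let mutant sets survive in isolation.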

To properly define an ESS for an infinite family of finite graphs in a way that recovers 
the classical definition asymptotically in the case of the family of complete graphs, we 
first must give a definition that restricts attention to families of mutant vertices that 
are smaller than some invasion threshold $\epsilon' n$, yet remain some constant fraction of the
population. This prevents “invasions” that survive merely by constituting a vanishing
fraction of the population.


Definition 29.14 Let $\epsilon' > 0$, and let $G = \{G_n\}_{n=0}^{\infty}$ be an infinite family of
graphs, where $G_n$ has $n$ vertices. Let $M = \{M_n\}_{n=0}^{\infty}$ be any family of (mutant)
vertices in $G_n$. We say that $M$ is $\epsilon'$-linear if there exists an $\epsilon$, $\epsilon' > \epsilon > 0$, such
that for all sufficiently large $n$, $\epsilon' n > |M_n| > \epsilon n$.


We can now give our definition for a strategy to be evolutionarily stable when 
employed by organisms interacting with their neighborhood in a graph. 


Definition 29.15 Let $G = \{G_n\}_{n=0}^{\infty}$ be an infinite family of graphs, where $G_n$
has $n$ vertices. Let $F$ be any 2-player, symmetric game for which $s$ is a strategy.
We say that $s$ is an ESS with respect to $F$ and $G$ if for all mutant strategies
$t \ne s$, there exists an $\epsilon_t > 0$ such that for any $\epsilon_t$-linear family of mutant vertices
$M = \{M_n\}_{n=0}^{\infty}$ all playing $t$, for $n$ sufficiently large, $M_n$ contracts.


Thus, to violate the ESS property for $G$, one must witness a family of mutations $M$ in
which each $M_n$ is an arbitrarily small but nonzero constant fraction of the population of
$G_n$, but does not contract (i.e., every mutant set has a subset of linear size that survives
all of its incumbent interactions). One can show that the definition given coincides with
the classical one in the case where $G$ is the family of complete graphs, in the limit of
large $n$. We note that even in the classical model, small sets of mutants were allowed
to have greater fitness than the incumbents, as long as the size of the set was $o(n)$.

In the definition above there are three parameters: the game $F$, the graph family $G$,
and the mutation family $M$. Our main results will hold for any 2-player, symmetric
game $F$. We will study a rather general setting for $G$ and $M$: that in which $G$ is a family
of random graphs and $M$ is arbitrary. We will see that, subject to conditions on degree
or edge density (essentially forcing connectivity of $G$ but not much more), for any
2-player, symmetric game, the ESS of the classical setting, and only those strategies, are
always preserved. Thus, for the purposes of characterizing stable strategies, the classical
method of pairing organisms at random is equivalent to randomizing the graph.


29.4.1 Random Graphs, Adversarial Mutations 


We now proceed to state and prove the random graph result in the network ESS model.
We consider a setting in which the graphs are generated via the $G_{n,p}$ model of Erdős and
Rényi. In this model, every pair of vertices is joined by an edge independently and with
probability $p$ (where $p$ may depend on $n$). The mutant set, however, will be constructed
adversarially (subject to the linear size constraint given by Definition 29.15). For this
setting, we show that for any 2-player, symmetric game, $s$ is a classical ESS of that
game, if and only if $s$ is an ESS for $\{G_{n,p}\}_{n=0}^{\infty}$, where $p = \Omega(1/n^c)$ and $0 \le c < 1$, and
any mutant family $\{M_n\}_{n=0}^{\infty}$, where each $M_n$ has linear size. We note that under these
settings, if we let $c = 1 - \gamma$ for small $\gamma > 0$, the expected number of edges in $G_n$ is
$n^{1+\gamma}$ or larger, that is, just superlinear in the number of vertices and potentially far
smaller than $O(n^2)$. It is easy to convince oneself that once the graphs have only a linear
number of edges, we are flirting with disconnectedness, and there may simply be large
mutant sets that can survive in isolation due to the lack of any incumbent interactions
in certain games. Thus in some sense we examine the minimum plausible edge density.


Theorem 29.16 Let $F$ be any 2-player, symmetric game, and suppose $s$ is a
classical ESS of $F$. Let the infinite graph family $G = \{G_n\}_{n=0}^{\infty}$ be drawn according
to $G_{n,p}$, where $p = \Omega(1/n^c)$ and $0 \le c < 1$. Then with probability 1, $s$ is an ESS
with respect to $F$ and $G$.


A central idea in the proof is to divide mutants into two categories, those with
“normal” fitness and those with “abnormal” fitness. Normal fitness means within a
$(1 \pm \tau)$ factor of the fitness given by the classical model, where $\tau$ is a small constant
greater than 0, and abnormal fitness means outside of that range. We will use the lemma
below (provided without proof) to bound the number of incumbents and mutants of
abnormal fitness.


Lemma 29.17 For almost every graph $G_{n,p}$ with $(1-\epsilon)n$ incumbents, all but
$\frac{24 \log n}{\tau^2 p}$ incumbents have fitness in the range $(1 \pm \tau)[(1-\epsilon)F(s|s) + \epsilon F(s|t)]$,
where $p = \Omega(1/n^c)$ and $\epsilon$, $\tau$ and $c$ are constants satisfying $0 < \epsilon < 1$, $0 < \tau <
1/6$, $0 \le c < 1$. Similarly, under the same assumptions, all but $\frac{24 \log n}{\tau^2 p}$ mutants
have fitness in the range $(1 \pm \tau)[(1-\epsilon)F(t|s) + \epsilon F(t|t)]$.
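
The concentration that this lemma describes can be observed empirically. The sketch below samples one Erdős–Rényi graph, declares the first εn vertices to be mutants (one arbitrary placement; an adversary could pick any other), and counts the incumbents whose fitness falls outside a (1 ± τ) factor of the classical value (1 − ε)F(s|s) + εF(s|t). It is an experiment under stated assumptions, not a proof: the function names, the fixed seed, and the sample parameters are hypothetical, and the multiplicative range presumes a positive classical fitness.

```python
import random

# Minimal sketch: count incumbents of "abnormal" fitness in one sample of G(n, p).

def gnp(n, p, rng):
    """Sample an Erdos-Renyi graph as adjacency lists; each pair is an edge w.p. p."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def abnormal_incumbents(n, p, eps, tau, f_ss, f_st, seed=0):
    """Number of incumbents whose fitness is outside (1 +/- tau) * classical fitness."""
    rng = random.Random(seed)
    adj = gnp(n, p, rng)
    m = int(eps * n)  # vertices 0..m-1 play the mutant strategy (arbitrary choice)
    classical = (1.0 - eps) * f_ss + eps * f_st  # assumed positive
    low, high = (1.0 - tau) * classical, (1.0 + tau) * classical
    count = 0
    for u in range(m, n):
        deg = len(adj[u])
        if deg == 0:
            count += 1  # no neighbors: treat as abnormal
            continue
        mutant_nbrs = sum(1 for v in adj[u] if v < m)
        fit = (mutant_nbrs * f_st + (deg - mutant_nbrs) * f_ss) / deg
        if not (low <= fit <= high):
            count += 1
    return count
```

With a dense graph (for instance n = 400 and p = 0.5) the count is typically zero or very small, since each incumbent's neighborhood is a large random sample of the population.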


With this lemma we first show that all but $o(n)$ of the population (incumbent or
mutant) have an incumbent neighbor of normal fitness. This will imply that all but $o(n)$
of the mutants of normal fitness have an incumbent neighbor of higher fitness. The
vehicle for proving this is the following result from random graph theory, which gives
an upper bound on the number of vertices not connected to a sufficiently large set, $U$.


Theorem 29.18 Suppose $\delta = \delta(n)$ and $C = C(n)$ satisfy $\delta p n \ge 3 \log n$, $C \ge
3 \log(e/\delta)$, and $C \delta n \to \infty$. Then almost every $G_{n,p}$ is such that for every $U \subset
V$, $|U| = u = \lceil C/p \rceil$, the set $T_u = \{x \in V \setminus U \mid \Gamma(x) \cap U = \emptyset\}$ has at most $\delta n$
elements.


This theorem assumes that the size of this large set $U$ is known with equality, which
necessitates the union bound argument below. The second main step of the proof uses
Lemma 29.17 again, to show that there can be at most $o(n)$ mutants with abnormal
fitness. Since there are so few of them, even if none of them have an incumbent neighbor
of higher fitness, $s$ will still be an ESS with respect to $F$ and $G$.

PROOF (Sketch) Let $t \ne s$ be the mutant strategy. Since $s$ is a classical ESS,
there exists an $\epsilon_t$ such that $(1-\epsilon)F(s|s) + \epsilon F(s|t) > (1-\epsilon)F(t|s) + \epsilon F(t|t)$,
for all $0 < \epsilon < \epsilon_t$. Let $M$ be any mutant family that is $\epsilon_t$-linear. Thus for any fixed
value of $n$ that is sufficiently large, there exists an $\epsilon$ such that $|M_n| = \epsilon n$ and $\epsilon_t >
\epsilon > 0$. Also, let $I_n = V_n \setminus M_n$ and let $I' \subseteq I_n$ be the set of incumbents that have
fitness in the range $(1 \pm \tau)[(1-\epsilon)F(s|s) + \epsilon F(s|t)]$ for some constant $\tau$, $0 <
\tau < 1/6$. Lemma 29.17 shows $(1-\epsilon)n \ge |I'| \ge (1-\epsilon)n - \frac{24 \log n}{\tau^2 p}$. Finally, let

$$T_{I'} = \{x \in V \setminus I' \mid \Gamma(x) \cap I' = \emptyset\}.$$


(For the sake of clarity we suppress the subscript $n$ on the sets $I'$ and $T$.) The
union bound gives us

$$\Pr(|T_{I'}| \ge \delta n) \le \sum_{i = (1-\epsilon)n - \frac{24 \log n}{\tau^2 p}}^{(1-\epsilon)n} \Pr(|T_{I'}| \ge \delta n \text{ and } |I'| = i). \qquad (29.7)$$


Letting $\delta = n^{-\gamma}$ for some $\gamma > 0$ gives $\delta n = o(n)$. We will apply
Theorem 29.18 to the summand on the right-hand side of Equation 29.7. If we let
$\gamma = (1-c)/2$, and combine this with the fact that $0 \le c < 1$, all of the
requirements of this theorem will be satisfied (details omitted). Now when we apply this
theorem to Equation 29.7, we get

$$\Pr(|T_{I'}| \ge \delta n) \le \sum_{i = (1-\epsilon)n - \frac{24 \log n}{\tau^2 p}}^{(1-\epsilon)n} \exp\left(-\frac{1}{6} C \delta n\right) = o(1). \qquad (29.8)$$

This is because Equation 29.8 has only $\frac{24 \log n}{\tau^2 p}$ terms, and Theorem 29.18 gives
us that $C \ge (1-\epsilon) n^{1-c} - \frac{24 \log n}{\tau^2}$. Thus we have shown, with probability tending
to 1 as $n \to \infty$, at most $o(n)$ individuals are not attached to an incumbent which
has fitness in the range $(1 \pm \tau)[(1-\epsilon)F(s|s) + \epsilon F(s|t)]$. This implies that the
number of mutants of approximately normal fitness, not attached to an incumbent
of approximately normal fitness, is also $o(n)$.

Now those mutants of approximately normal fitness that are attached to an
incumbent of approximately normal fitness have fitness in the range
$(1 \pm \tau)[(1-\epsilon)F(t|s) + \epsilon F(t|t)]$. The incumbents that they are attached to have fitness in the
range $(1 \pm \tau)[(1-\epsilon)F(s|s) + \epsilon F(s|t)]$. Since $s$ is an ESS of $F$, we know
$(1-\epsilon)F(s|s) + \epsilon F(s|t) > (1-\epsilon)F(t|s) + \epsilon F(t|t)$; thus if we choose $\tau$ small enough,
we can ensure that all but $o(n)$ mutants of normal fitness have a neighboring
incumbent of higher fitness.

Finally by Lemma 29.17, we know that there are at most $o(n)$ mutants of
abnormal fitness. So even if all of them are more fit than their respective incumbent
neighbors, we have shown all but $o(n)$ of the mutants have an incumbent neighbor
of higher fitness.


Next we briefly outline how to prove a converse to Theorem 29.16. Observe that if
in the statement of Theorem 29.16 we let $c = 0$, then $p = 1$, which in turn makes $G =
\{K_n\}_{n=0}^{\infty}$, where $K_n$ is a clique of $n$ vertices. Then for any $K_n$ all of the incumbents will
have identical fitness and all of the mutants will have identical fitness. Furthermore, if $s$
is an ESS for $G$, the incumbent fitness will be higher than the mutant fitness. Finally, one
can show that as $n \to \infty$, the incumbent fitness converges to $(1-\epsilon)F(s|s) + \epsilon F(s|t)$,
and the mutant fitness converges to $(1-\epsilon)F(t|s) + \epsilon F(t|t)$. In other words, $s$ must be
a classical ESS, providing a converse to Theorem 29.16.
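
The convergence claimed for the clique can be checked with exact formulas. In a clique on n vertices with m mutants, an incumbent has n − m − 1 incumbent neighbors and m mutant neighbors, while a mutant has n − m incumbent neighbors and m − 1 mutant neighbors. The sketch below computes both fitness values and the classical limits for comparison; the function names and the payoff dictionary layout are assumptions of this illustration.

```python
# Minimal sketch: exact incumbent and mutant fitness in the clique K_n with m mutants,
# compared with the classical values for mutant fraction eps = m / n.

def clique_fitness(n, m, payoff):
    """Return (incumbent fitness, mutant fitness) in K_n; requires 1 <= m <= n - 1."""
    if not (1 <= m <= n - 1):
        raise ValueError("need at least one mutant and one incumbent")
    incumbent = ((n - m - 1) * payoff[("s", "s")] + m * payoff[("s", "t")]) / (n - 1)
    mutant = ((n - m) * payoff[("t", "s")] + (m - 1) * payoff[("t", "t")]) / (n - 1)
    return incumbent, mutant


def classical_fitness(eps, payoff):
    """Return the classical (incumbent, mutant) fitness for mutant fraction eps."""
    incumbent = (1.0 - eps) * payoff[("s", "s")] + eps * payoff[("s", "t")]
    mutant = (1.0 - eps) * payoff[("t", "s")] + eps * payoff[("t", "t")]
    return incumbent, mutant
```

The gap between the two pairs of values shrinks like 1/(n − 1), because the only discrepancy is that a vertex does not play against itself.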


29.5 Future Work 


Most evolutionary game-theoretic models consider an infinite population of agents. 
These agents usually obey some simple dynamic such as imitation or replication. 
Typical results in these models show that in the limit (as time goes to infinity) the 
population converges to an equilibrium. A major open problem in the intersection of 
evolutionary game theory and theoretical computer science is to analyze a population 
of n agents, who obey one of these dynamics, and bound the time of convergence to an 
equilibrium. The notions of equilibrium and stability might have to be adapted to this 
new finite setting. Results along these lines would yield simple, distributed algorithms 
that agents could implement and converge to an equilibrium in a bounded (and hopefully 
short) amount of time. This would provide a contribution beyond proving the existence
of equilibria, and beyond showing that an infinite population will eventually converge
to one. It would show that a population of a given size will converge to a stable equilibrium
within a certain amount of time.

To start on this endeavor, the simplest models could consider n agents, where each 
agent could interact with each other agent. One example of such a problem would be to 
analyze a selfish routing model, such as the one described in Section 29.3, except with 
n agents, as opposed to infinitely many, and show a strongly polynomial time bound 
for their convergence. After baseline models such as this have been developed and 
studied, one might then try to find dynamics that result in these agents converging to an 
equilibrium that maximizes an appropriate notion of social welfare. Another extension 
would be to consider models where agents are arranged in a graph and can only interact 
with agents in their local neighborhood. One could then analyze not only the effect of 
the graph topology on equilibrium, as was done in Section 29.4, but also how it affects 
the convergence time. 

It may turn out that hardness results stand in the way of such progress. Then one 
could try to bound the time of convergence to an approximate equilibrium, or simply 
bound the amount of time the population spends far away from an equilibrium. Also,
results such as the one given in Section 29.2 imply that there exist games for which it is
hard to compute equilibria. There still could be many well-motivated classes of games 
for which arriving at an equilibrium is computationally tractable. 


29.6 Notes 


The motivation for evolutionary game theory and the description of the model,
definitions, and dynamics were inspired by Smith (1982), Osborne and Rubinstein (1994),
Weibull (1995), Hofbauer and Sigmund (1998), Kontogiannis and Spirakis (2005),
and Kearns and Suri (2006). The Hawks and Doves game and its motivation come
from Smith (1982), Osborne and Rubinstein (1994), Weibull (1995), and Alexander
(2003).

The section on the computational complexity of ESS comes from Nisan (2006),
which extended work by Etessami and Lochbihler (2004). Lemma 29.3 is a slight
modification of a lemma in Motzkin and Straus (1965). Papadimitriou and Yannakakis
(1982) show the problem of determining whether or not a graph has a maximum clique
of size $k$ is coD$^p$-hard. We will not define the complexity class coD$^p$ here, but simply
state that it contains both NP and coNP. Etessami and Lochbihler (2004) show that
finding a strategy that is close in $\ell_p$ norm to an ESS takes super-polynomial time
unless P=NP. They also show that finding an ESS is in $\Sigma_2^p$, and that finding a regular
ESS is NP-complete. In addition, they prove that counting the number of ESS and
counting the number of regular ESS are both #P-hard.

Most of Section 29.3 comes from Fischer and Vöcking (2004) and Fischer (2005).
For more details regarding the convergence of the imitative dynamics to a Nash flow,
see those two references. We refer the reader to Brauer and Nohel (1969) for an
excellent introduction to the Lyapunov framework. For a more extensive and technical
treatment see Bhatia and Szegő (1970). For applications of the Lyapunov framework
to other evolutionary game theoretic models and dynamics, see Weibull (1995) and
Hofbauer and Sigmund (1998). There are many other places where evolutionary game
theory is studied in conjunction with imitative dynamics; for example, see Björnerstedt
and Schlag (1996), Schlag (1998), and chapter 4 of Weibull (1995).

There is a nice sequence of papers that continues the work of Fischer and Vöcking
(2004) shown in Section 29.3. Fischer and Vöcking (2005) consider a similar model
where agents may have stale information regarding the latencies of other paths.
Fischer et al. (2006) consider a model where agents switch paths in a round-based
fashion.

Section 29.4 comes from Kearns and Suri (2006). Vickery (1987) first noticed that
a constant number of mutants may have higher fitness than the incumbents who are
playing an ESS. Theorem 29.18 is Theorem 2.15 from Bollobás (2001). In Kearns and
Suri (2006), the authors give a pair of results dual to Theorem 29.16 and its converse.
They show that if the graph is chosen adversarially, subject to some density restrictions,
and the mutants are chosen randomly, then ESS are preserved.


Exercises


29.1 Find the ESS of Prisoner’s Dilemma.

29.2 In the game of Hawks and Doves, given by Figure 29.1, if $V < C$, show that $V/C$
is a mixed strategy ESS. (Hint: Use the fact that for any mixed Nash equilibrium $s^*$
with support $s_1, s_2, \ldots, s_k$, $F(s_1|s^*) = F(s_2|s^*) = \cdots = F(s_k|s^*) = F(s^*|s^*)$.)

29.3 Consider a $2 \times 2$ symmetric game with four arbitrary constants for payoffs.
Characterize the ESS for such a game in terms of the payoffs. Use this to conclude that
any $2 \times 2$ symmetric game has an ESS.

29.4 Give an example of a game that has a Nash Equilibrium but no ESS. 


29.5 Prove Lemma 29.7.

29.6 Show that $\sum_{p \in P} x_p' = 0$, where $x_p'$ is defined by Equation 29.2. Using this, conclude
that if, in the selfish routing model of Section 29.3, the imitative dynamics initially
start with a feasible flow, then for all time the flow remains feasible.

29.7 Show that there exists a game such that with high probability for a family of random
graphs with $p = \Omega(1/n^c)$ and $0 \le c < 1$, an adversary can construct a mutant set
such that there will exist at least one mutant with higher fitness than all of its
incumbent neighbors.